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Part I 

Preamble 




Motivation 


Since its invention in 1982 by Teuvo Kohonen, the study of self-organzing maps 
has become a vast field of research in computer science. In more than 7000 pub¬ 
lications based on the SOM methods (cf.[KKK],[OKK98]) many properties as well 
as practical applications of the algorithm have been examined and presented and 
the progress still goes on. This thesis contributes to it by focusing on one already 
encountered as well as on one new aspect of the SOM that shall be motivated in the 
following sections. 


The three "travelling ruler problems" 


The possibility to tackle the famous travelling salesman problem (TSP) by using Ko- 
honen's maps was for the first time discovered by Durbin and Willshaw [DW87], 
They used a closed ring of neurons to represent the (shortest) path for the sales¬ 
man. By adapting the Self-organizing map based on the position of the cities, they 
obtained a good approximation of the solution of the problem even if it is not guar¬ 
anteed that the global minimum for the path length is always found. However this 
method has been always restricted to the case where the cities lie in an Euclidean 
plane. Thus the question arose for us whether the classical SOM can be generalized 
in a way to also solve the case where he cities may lie on more general surfaces or 
spaces. The following three examples named the "travelling ruler problems" shall 
briefly illustrate what is meant by that. 



FIGURE 1: Traveling ruler problems: (left) flat (middle) spherical (right) hyperbolic space 


Emperor Frederick I Barbarossa 

In the year of our lord 1189, Frederick I Barbarossa is about to prepare his crusade 
to the holy land. He plans to visit at least several cities on his way. They are been 
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shown in the left map in Fig.l. Starting in Frankfurt where he has been elected and 
crowned many years before he wants to meet the pope in Rome, accompany his 
son to Hungary where he will get engaged with a local princess, cross the strait be¬ 
tween Europe and Asia at Gallipoli and finally "liberate" the holy city of Jerusalem. 
According to the doctrine of the church, this medieval world is flat. So, Frederick 
could just plan his travel by using a map and a ruler (at least if he wants to take the 
straight route ignoring any obstacles). 


Pres. Barack Obama 

More than 800 years later, the newly-elected US-President Barack Obama is about 
to make his first official trip abroad. He wants to meet his European allies as well as 
inspect several "conflict zones" of the past and present, namely Russia, China and 
the Middle East. Furthermore he plans to take the opportunity to also visit some 
relatives in Nairobi, Xenia. Unlike the medieval world of Barbarossa his world is 
no longer flat because in the meantime an Italian guy named Gallilei (re)discovered 
that the surface of the world is actually curved like a sphere. Thus in order to find 
now the shortest flight route for the "Air Force One" this has to be taken into ac¬ 
count. 


President Kse'nu 

Again a few hundred years later the (fictive) President Kse'nu of the Galactic Con¬ 
federacy encounters his "travelling spaceman problem" as he wants to visit his do¬ 
minions that are located in several galaxies. Since scientists have finally proven 
today's conjecture that the space is in fact also globally curved, but this time hy- 
perbolically, this non-Euclidean geometry affects the route that he has to pick to 
minimize the flight path for his pan-galactic cruiser. 

While the problem of Barbarossa in the Euclidean world matches the case formerly 
discussed by Durbin and Willshaw, the classical SOM does not provide a way to 
approach the latter two. Our first goal shall therefore be to extend the SOM algo¬ 
rithm, in order to handle non-Euclidean or even more general feature spaces, and 
to present an implementation of this General Riemannian SOM (GRiSOM) model. 


Stability analysis 


A second field of interest of the work concerning this thesis was to verify and extend 
the analytic and numerical analysis of the SOM that was formerly done by Ritter 
and Schulten [RS88]. For the numerical case we are mostly interested how the sta¬ 
bility limits are affected if we use other regular maps. In the numerical approach we 
then want to verify the analytic results and in addition to that examine the stabil¬ 
ity properties of several configurations using a HSOM and even the newly-defined 
GRiSOM. The latter case is, in particular, very interesting, as recently published pa¬ 
pers like one of Tran and Vu[TV08] discussed the ability to apply dimensionality 
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reduction techniques in Hyperbolic data spaces and the corresponding reconstruc¬ 
tion of the data and the GRiSOM naturally provide a way to perform this task under 
the given circumstances. 




Overview / Reader's Guide 


Based on the given motivation this thesis focuses on two topics. First of all, we want 
to extend the definition of the classic SOM to obtain the GRiSOM and, secondly, we 
are going to focus on the analytic and numerical analysis of the stability limits of 
several SOM configurations. The first goal can be achieved quite straightforwardly 
and briefly as the specification is kept as abstract as possible and therefore not much 
previous knowledge is needed. The second goal we are striving for, however, needs 
much more preparatory work since we have to specify concretely all the compo¬ 
nents required for the simulation of the GRiSOM. This thesis will hereby try to give 
a self-contained view on all the subjects we have to deal with. Even with having 
banished the lesser instructive calculations and results to the appendix, the com¬ 
plexity of the mainpart is therefore still rather high since e.g. some longish deriva¬ 
tions have been kept therein as they provide a deeper, essential insight into the 
currently handled matter. This shall enable the more interested readers to compre¬ 
hend the detailed train of thoughts and furthermore provide detailed guidelines on 
how to tackle the problems we faced there for future research in this field. 

While some readers, who are interested in every detail of the work, are benefited 
by this style of presentation, the "readers in a hurry" on the other hand may be 
hindered as it is harder to distinguish the less important deviations and the more 
important intermediate and final results when skim-reading is used on this large 
volume of text. To support as well this type of readers without tearing the structure 
of the text apart (which would result in clouding the thread for the more inter¬ 
ested reader) we asterisked those sections of the text, which can be skipped as they 
present only less important deviations or intermediate results that are mostly of 
importance in the local scope or just provide detailed insights that are not needed 
to follow the main thread. If, nonetheless, some formulas of these parts have to be 
used in more important sections they are cross-referenced and thus easily retriev¬ 
able. 

Furthermore does each chapter begin with a brief introductory paragraph where 
the content and its structure are motivated and which provides a more detailed ori¬ 
entation for both more interested and skim readers. And, finally, to give an overall 
view, a brief presentation and motivation of the chosen structure of the whole thesis 
is offered in the following. 

As indicated above, the thesis starts with a chapter wherein we introduce the newly- 
defined General Riemannian Self-organizing map. We therefore discuss briefly the clas¬ 
sic SOM to have then a look at the intrinsic prerequisite, that the SOM algorithm 
imposes on the used map and feature spaces. This will then enable us to find a nat¬ 
ural modification of the SOM algorithm which finally lead to the specification of 
the GRiSOM. 
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As the definition of the GRiSOM in chapter 1 is kept quite abstract, we have now to 
concern ourselves with some preparatory work to specify concrete GRiSOM mod¬ 
els before we can perform any real simulations e.g. to examine the stability 

First of all, we have to take a look at the map and feature spaces that we want 
to use. Ch.2 therefore focus on the geometrical properties of certain Euclidean and 
non-Euclidean spaces. We will thereby have a brief look at the axioms defining these 
spaces in general and are going to introduce subsequently several concrete models 
of these spaces that we later want to use in our simulations and calculations. Sev¬ 
eral structures, such as the geodesics and isometries will also be derived in detail. 

Given a certain map space, we will have then to decide how to embed the neu¬ 
rons, which form the map, in it. We will therefor study in chapter 3 the possible 
regular tessellation of the hyperbolic and Euclidean map space and have to occupy 
ourselves with the question on how to generate these tilings. 

Chapter 4 now raises the issue of the generation of appropriate sample sets for 
the stability analysis. We will first loosely define the needed distributions and then 
focus in the second part on deriving mathematically the algorithms needed to gen¬ 
erate samples that realize these previously defined distributions. 

The preparatory work is concluded by presenting a concrete implementation of the 
GRiSOM, the derived spaces, maps and distributions. We will briefly discuss the 
design and most important features of the software bundle that has been written in 
the course of this thesis. 

Having finished the discussion of the mathematical background and the imple¬ 
mentation needed to construct concrete SOMs, we will be finally able to concern 
ourselves with the dynamics and, in particular, the stability of certain equilibri¬ 
ums states of the SOM. This will be done by using both analytic and numerical 
approaches. 

First we will focus on the analytical analysis, confining the calculations in chap¬ 
ter 6 to regular Euclidean maps and Euclidean feature spaces. We will deduce the 
Fokker-Planck equation for the corresponding stochastic process and use it then to 
determine the stability limit for the particular SOM configurations. 

After having tackled the analytic approach, we are going to verify and further¬ 
more extend these results with numerical means i.e. the Monte Carlo method by 
using the implemented GRiSOM in combination with the sample sets whose dis¬ 
tributions we will have discussed in chapter 4. We will start by discussing how the 
stability limits can be determined by evaluating the data produced by this simula¬ 
tion. We will then focus on verifying the results obtained by the analytic approach 
using an classic SOM setting with regular Euclidean maps. We will then continue 
with the numerical stability analysis of a HSOM, which has a hyperbolic map space, 
and finally are going to try to determine some aspects of the stability behavior of a 
GRiSOM with both hyperbolic map and feature space. The numerical simulations 
and therefore the whole analysis part will be concluded by returning to the "Trav¬ 
elling ruler problems" that we used as our primary motivation that resulted in the 
GRiSOM. We will thereby present the results which are obtained when performing 
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the numerical simulation using the GRiSOM with the proper feature spaces. 

The final chapter of the thesis is reserved for a conclusion of our work. We will 
briefly take a final look at the results we obtained in the stability analysis, discuss 
problems we have encountered and finally outline some possible goal for future 
work on this subject. 

As noted above, the appendix finally contains proofs, calculations, data and fig¬ 
ures that were omitted before as they would have bloated the chapters and/or dis¬ 
tract the reader from more important calculations and results. This includes e.g. the 
elementary geometric calculations and the derivation of several equations needed 
in the analytic stability analysis. Beside this, an installation guide for the software 
bundle can also be found there. 




Part II 

GRiSOM 




Chapter 1 

A Generalized View on Self Organizing 
Maps 


This first chapter is dedicated to the definition of the General Riemannian SOM which 
is a modification of the classic self-organizing maps, formerly introduced by Teuvo 
Kohonen [Koh82], to obtain a unifying model for SOMs for both more general map 
and input spaces. We therefore briefly present the original definition of the SOM 
followed by a detailed discussion of modifications we made that lead to GRiSOM. 


1.1 Self-organizing map 


In general, the self-organizing maps (SOM) are a class of artificial neural networks, 
which is based on competitive learning on a fully-connected set of artificial neurons 
(cf. Fig.2). 



FIGURE 2: formal Kohonen feature map with 2-dimensional input and 3x3 map 

It represents a topology-preserving generalization of Vector Quantization^ Q) and is 
used for many purposes including visualization of high-dimensional data, classifi¬ 
cation/ clustering and automated feature extraction. 

The basic idea is to place the artificial neurons at the vertices of a more dimensional 
lattice which defines a topological structure on the neurons. By taking their synaptic 
weight vectors, they are furthermore mapped to specific locations in the data, or as 
we will call it henceforth, feature space F. The (output) neurons now compete among 
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themselves and only the winning neuron is fired for any input/sample ("winner- 
takes-all principle"). In other words, the input patterns in the feature space (of ar¬ 
bitrary dimension) are transformed in a (lesser dimensional) discrete map (—> di¬ 
mension reduction) as every input is mapped to a unique winning neuron, but, in 
contrast to VQ, in a topological ordered fashion. 

The whole process of this transformation can be reduced to the following sub¬ 
processes: 

Competition: As mentioned above, the neurons compete among themselves. For 
this purpose, each neuron compute a value using a discriminant function 
which calculates the response of the particular neuron to the input. The neu¬ 
ron belonging to the largest response is then the winner. 

To be more precise: Given a SOM with a set of neurons A and a (n-dimensional) 
input sample v and let furthermore Wj be the synaptic weight vector (and 
therefore its location in the feature space) of neuron j and disc(-) a discrim¬ 
inant function, then the winning neurons i(v) belonging to input v is deter¬ 
mined by: 

i(v) = arg max (disc (v,zvA) 
jeA 

Due to the neurobiological motivation that led to the SOMs, an inner product 
v ■ Wj is typically used to compute the neural response i.e. as the discriminant 
function. The resulting maximum problem can then be substituted by finding 
the minimal Euclidean distance: 

i(v) = aremin llu — wAl 

jeA 11 

where || ■ || is the canonical Euclidean norm. As each possible point in the fea¬ 
ture space is mapped to the neuron which is located next to v in F, we get a 
Voronoi tessellation of the feature space according to its metric (cf. chapter 3). 
Each neuron is then a representative of a whole set of points, namely all the 
points lying in the corresponding voronoi cell. 

Cooperation: Unlike in VQ, the winning neuron is cooperating with other neurons 
lying in a certain topological neighborhood i.e. it is enlarging or inhibiting 
the response of these neighbors. Given two neurons i and j, the degree of 
cooperation is thereby determined by a (unimodal) neighborhood function h^j, 
which typically depends only on the distance between the two neurons in 
the lattice of the map. There exists many different choices for hi r j and we are 
going to discuss three (classes of) possible functions in the last section of this 
chapter. 

Synaptic Adaption: All neurons in the topological neighborhood of the winner 
neuron are adapted, such that the response of them to similar input pattern is 
enhanced. This is achieved by applying a simplified form of Hebbian learn¬ 
ing, which uses in general the adaption rule for the location of neuron j in the 
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input space: 

A Wj = tjyjv-f{yj)wj 

where ?/ is an adaption parameter called learning rate, yj is the response of neu¬ 
ron j and f(ijj)Wj a forgetting term. We simplify this equation by substituting 
the response by the neighborhood function and the forgetting term by rjh^i. 
Hence we obtain (v input and i(v) again winner neuron) 

A Wj = Tjhjty'jiv - Wj) 

which is now used for the adaption of the neurons in the SOM. Thus, an adap¬ 
tion result in a move of the winner neuron and (at a lesser rate) of its neighbors 
(along an Euclidean line) in the feature space towards the input. The (partial) 
move of the neighbors as well ensures thereby that the topology defined by 
the lattice of the map tends to be preserved. 


1.1.1 The SOM Algorithm 

Combing the three processes discussed above, we can formulate the SOM algo¬ 
rithm: 

Step 1: (Initialization) Given a set of neurons A, fix the position of these neurons in 
the map and choose suitable initial locations for them in the feature space F 
(e.g. by random). 

Step 2: (Sampling) Draw an input sample v in F according to sensory input or a pre¬ 
specified probability distribution. 

Step 3: (Matching) Determine winner neuron s := i(v) which is the neuron that 
matches v best i.e. has the nearest representatives to the location of the sample 
in the feature space: 

||rT s — ull = min II w r — Ell 

reA 

Step 4: (Updating) Adjust locations of neurons in feature space according to rule 
found above for the process of the Synaptic Adaption: 

zvf ew = zv° ld + e ■ h r/S ■ (v — w° ld ) for all reA (1.1) 

Step 5: return to Step Step 2: until any predefined break conditions are met (e.g. reach¬ 
ing maximal number of adaption steps or falling below a certain representa¬ 
tion error) 


1.1.2 Dimensionality reduction/conflict 

We conclude the discussion of the classic SOM by briefly discussing on one of its 
fields of application, namely the dimension reduction. As mentioned above, the 
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SOM tries to embed itself into a given input set while preserving the topologic struc¬ 
ture of the set of nodes of the map space also in the feature space. We thus obtain 
a mapping of the input onto the discrete lattice of a certain dimension reducing the 
dimension of the feature space to the dimension of the map space. If the intrinsic 
dimension of the input embedded in the feature space is lower or equal than the 
map dimension, this reduction works smoothly, but if this intrinsic dimension ex¬ 
ceeds the map dimension, we face the so-called dimension conflict, i.e. the problem 
to represent a higher dimensional space with a lesser dimensional one. The SOM 
tries to still provide a good representation of the input by folding itself into the 
space, but is unable to take the topological structure of the input into account as 
even closely neighboring input often belong to Voronoi cells of map nodes, which 
are located on different foldings and therefore maybe far apart from each other in 
the map lattice. 


1.2 General Riemannian SOM 


Many different modifications of the SOM have been proposed over the last decades. 
Some of them like the Hyperbolic SOM [Rit99] focused on replacing the coopera¬ 
tion process by fixing the nodes in more complex lattices in a hyperbolic map space 
instead of the usual Euclidean square lattice above, which solved the dimensional 
conflict for at least Euclidean input space. Others, as the Batch SOM, changed the 
adaption process to improve the behavior of the SOM for non-stationary environ¬ 
ments [BMO05] and again others combined the SOM with kernel methods and thus 
adapted the competition procedure by transforming the input space [BJRV08]. 

Our modification called General Riemannian SOM will now be aimed at allowing 
the SOM to work on a whole range of different feature spaces as well as maps by 
complicating the SOM algorithm as little as possible. To achieve this goal, we will 
first have a look at the intrinsic properties of both the map and the feature space of 
the classic SOM and then try to generalize them to meet our needs. 


1.2.1 Map space 

Although the map is above just defined as a discrete lattice, we can consider the 
nodes of the map to be embedded in a certain space. Obviously, there are hardly any 
demands to the embedding space. That is, that we only have to be able to quantify 
the cooperation between two neurons, i.e. a kind of distance between two nodes in 
the map which is then used in the neighborhood function 1 . Hence a suitable choice 
is, to embed the map in a metric space, such that we can then naturally obtain the 
needed distance function on the nodes of the map lattice by transferring the dis¬ 
tance function defined in the metric space. 

In the numerical experiments we will mostly use Riemannian manifolds like the 
Euclidean Space or the Hyperbolic Space, but in general every set of points which has 


1 This also determines the topological structure that is inherent in the lattice 
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defined a metric on it can be used. The most simple one is the trivial space i.e. a set 
of points associated with the discrete metric 

d(x,x) = 0 
d(x,y) = 1 if x ^ y 

In this case, however, the topology is also the discrete topology and therefore each 
neighborhood. So its preservation is trivial. Choosing a suitable neighborhood func¬ 
tion, we just get a simple vector quantization. 


1.2.2 Feature space 

For the feature space, we have to consider that both the competition and the adap¬ 
tion process take place there. As we have seen above in the discussion of the classic 
SOM, the search for the neuron with the maximal response to given input v, defined 
by the inner product of weight vector and input position, can be substituted by the 
search for the neuron with the least Euclidean distance to the input in the feature 
space. While the definition of the response based on the inner product does not 
make sense in more general spaces (e.g. non-pre-Hilbert spaces), distance functions 
represent a more general structure 2 . Thus, in respect to the competition process, it 
suffices to have a general metric space. 

The second task, we have to be able to perform in the feature space, is the adaption 
process. In order to update the representatives, we need to determine how to move 
a point in direction to another one, or to be more exact, move the w° ld towards the 
input sample v. In the classic adaption process, this is done, as seen above, by calcu¬ 
lating the difference of the position vectors of the input and the neuron that has to 
be moved and adding it partially to the current position of this neuron. Obviously, 
this update method can be transferred to every other (finite-dimensional) vector 
space. But for more general spaces, like the Dini's surface seen on the titlepage, 
which is not a vector space, we have to find a more appropriated definition of the 
update rule. As we mentioned above, this vector operations can be interpreted as a 
move of the neuron along an Euclidean straight line, which connects both the neu¬ 
ron and the input. The concept of a line is now not restricted to the Euclidean space, 
but exists at least in many more general metric spaces. The so-called geodesic line 
is thereby, according to (Differential) Geometry (cf.e.g.[Aub01],[AMR88]), defined 
as a local length minimizing path in respect to the metric of the particular space. 
Thus, a suitable new definition of the update rule would be, to define the update in 
general as a move along these lines while the response of the neuron on the input 
determines, how far the neuron is moved towards the location of the input. 

We have, nevertheless, to consider, that in general metric spaces the shortest path 
has not to be neither existent nor unique, but both existence and uniqueness are 
essential for our modified update rule. So, in order to satisfy this requirements, 
we have to put a further restriction onto the used feature spaces. Fortunately this 


2 Each pre-Hilbert space has a naturally defined metric which turns it into a metric space, but there 
exists many metric spaces which are not even vector spaces 
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condition is always (at least) locally fulfilled by (pseudo-)Riemannian manifolds 
(cf.fAubOl, Thm.5.14]). Thus to ensure that our modified adaption process that we 
will define below is always well-defined we restrict the possible choice of the fea¬ 
ture space to those manifolds. 


1.2.3 The GRiSOM Algorithm 

The conclusion of the two preceding sections is, that it is possible, by just having a 
more generalized view on the intrinsic prerequisites of the three processes, to gen¬ 
eralize the classic SOM in a quite natural manner. This generalized model provides 
a huge range of possible configurations of the SOM in respect to the map and fea¬ 
ture space, which will, from now on, improve our ability to choose the right spaces 
which reflect the structure of the feature space (which is e.g. Hyperbolic or Spheri¬ 
cal as in the three "travelling ruler problems") and the expected intrinsic structure 
of the input (e.g. directional or hierarchical) most adequately to avoid, for example, 
dimensionality conflicts. 

By taking the modification made to the competition, cooperation and adaption pro¬ 
cess now into account, we can finally obtain the following modified SOM algo¬ 
rithm: 

Step 1: (Initialization) Given a set of neurons A, fix the position of these neurons in 
the map space M and choose suitable initial locations for them in the feature 
space F (e.g. by random). 

Step 2: (Sampling) Draw an input sample v in F according to sensory input or a pre¬ 
specified probability distribution. 

Step 3: (Matching) Determine winner neuron s := i(v) which is the neuron that 
matches v best i.e. has the nearest representatives to the location of the sam¬ 
ple in the feature space according to the metric/distance function dist in this 
feature space: 

dist(w Sr v) = min dist(iv r ,v) 
reA 

Step 4: (Updating) Adjust locations of neurons in feature space according to rule 
found above for the process of the Synaptic Adaption: 

r< ew = p 

with p G F satisfying the following conditions 

d(w? M ,p)+d(p,v) =d{wf d ,v), = £ -H r >s) 

that means that the representative is moved along a geodesic between re/ ld 
and v by the relative distance according to the result of the neighborhood 
function and the step size. This adaption algorithm is well-defined as long as 
the geodesics are unique. 
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Step 5: return to Step Step 2: 

According to the section above, this will be the case for the right choice of the feature 
space. We will take a detailed look on this aspect with regard to the concrete spaces 
used in the simulations right in the next chapter. But before, we will briefly have a 
look at the (classes of) neighborhood functions that will be used in the SOM. 


1.3 Neighborhood functions 


As mentioned above, the SOM allows winner neurons to interact with its neighbor¬ 
hood. This cooperation is then implemented by the neighborhood function, which 
depends usually only on distances in the mapspace and determines how extensive 
the adaption process is. We will distinguish here between three classes of possible 
functions: 

Long range / Gaussian One representative of the long-range neighborhood func¬ 
tions is the Gaussian function (with mean value }i = 0) 


or in the case of a discrete grid of neurons 

h rr' USS = E^+V' eX P(“ ^2 ) (l- 2 ) 

where cr 2 is the variance. With this class of functions in use, the excitation is 
laterally inhibited (i.e. h is strictly decreasing), but has no upper limit for the 
range and thus affects the whole set of neurons. This results in an enormous 
need of computing time since in every step, each neuron has to be adapted. 

Short range / NN To evade the problem of the excessive needs of the long-ranged 
functions in respect to computing time, the short-ranged functions possess an 
(bounded) compact support and therefore the excitation is locally restricted. 
Thus only a small fraction of the neurons has to be updated which acceler¬ 
ate the whole algorithm at the cost of losing interactions beyond the neigh¬ 
borhood defined by support. An example for this family of functions is the 
following function: 


h NN (x ) = @(x-d NN ) 


1 if neurons are nearest neighbors 
0 otherwise 


(1.3) 


where is the distance of the nearest neighbors. We will consider this func¬ 
tion besides the Gaussian one when working both numerically and analyti¬ 
cally. 
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Ultra-short range / VQ In the limit of small ranges and small variances both cases 
above lose their capability to assure the preservation of the neighborhood in 
the feature space. Only winner neurons are adapted, i.e. we have a case of 
vector quantization. 


h v ®(x) = S(x) 


(1.4) 



Part III 

Preparatory work 




Chapter 2 

Certain Aspects of Riemannian manifolds 


In the preceding chapter we have defined the GRiSOM using Riemannian mani¬ 
folds as feature (and map) spaces. Since this offers a field of study that is too broad 
for the scope of this thesis, we will focus on the two- and three-dimensional man¬ 
ifolds with constant sectional curvatures -1, 0 and 1. These are (in same order) the 
Hyperbolic, Euclidean and Spherical space (or surface) corresponding to the three 
feature spaces that we have encountered in the three modified travelling sales¬ 
man problems. Furthermore, have Hyperbolic or Hierarchical Self-Organizing Maps 
(HSOM) been already considered in present works, e.g. Ritter et al. (cf.[Rit99], [OR06], 
[ORGGOl]) to improve the clustering and dimension reduction for hierarchical data 
set. These types of SOMs therefore have maps embedded in the Hyperbolic space 
which take the hierarchical structure into account by providing an exponential growth 
of the neighborhoods in the map space. So it stands to reason to use this config¬ 
uration as well in the present work to study the properties of SOMs with Non- 
Euclidean feature spaces and extend these studies to non-euclidean feature spaces. 

The most common way to introduce non-euclidean spaces is to present concrete 
models and define things like geodesics/lines and distance measures in them. Some 
of them offer a simple way to visualize e.g. the hyperbolic or spherical plane in a 
way, that can be easily understood. Most popular in the hyperbolic case are the 
hyperboloid model which is an isometric embedding of the hyperbolic plane in the 
Minkowski space and the Poincare Disk/Sphere model. It is not wrong to decide to 
work only in a particular model since the hyperbolic space is categorical i.e. its ax¬ 
ioms define the geometry completely and uniquely and thus all representations are 
isomorphic, but it may mislead readers to associate every definition in this space 
with only a single particular model, which may complicate the comprehension of 
some structures we will have to derive. Thus we will take another, more fundamen¬ 
tal approach following [RR95]. The euclidean, spherical and hyperbolic geometry 
will be introduced by their axioms, not just to get a feel for the fundamental struc¬ 
tures, but also see, how similar the three geometries are and where are their differ¬ 
ences. First this will be done in two dimensions and then be generalized to three 
dimensions for the hyperbolic case. After these first steps, the later used analytic 
models and representations will be introduced. The "readers in hurry" may focus 
on these sections. 

Therefore we present in the last part of this chapter the formulas concerning prop¬ 
erties and structures, which are needed for the further studies, with an emphasis on 
hyperbolic geometry. 
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2.1 Euclidean Geometry 


Although geometry is one of the oldest sciences and is found throughout many dif¬ 
ferent human cultures for millennia like the famous studies of Euclid in Ancient 
Greece (ca. 300 BC) or the less known work of the followers of Mozi (ca. 330 BC, 
China), it had, for long time, a flawed structure with rather informal axioms. This 
was fixed by mathematicians at the end of the 19th century, especially by David 
Hilbert who reworked the whole structure by strict axiomatic reasoning and pre¬ 
sented 1899 with his "Grundlagen der Geometrie" [Hil99] the modern foundations 
of geometry 

2.1.1 *Hilbert's Axioms of the Euclidean Plane 

Given a plane P i.e. a set of a priori undefined things called points and subsets of it 
isomorphic to R called lines, the axioms of Hilbert provide a structure on it and we 
get the Euclidean plane E 2 : 

Axiom 1 (line). If A and B are distinct points, i.e. the symbols A and B denote different 
points, then there is one and only one line that passes through them both. This line is denoted 
by AB. 

Axiom 2 (metric space). There is a function |PQ| defined for all pairs of points P, Q, such 
that 

(i) |PQ| > 0 (non-negativity) 

(ii) |PQ| = |QP| ( symmetry) 

(Hi) |PQ| < |PR| + \RQ\for all arbitrary points R (triangle inequality) 

Axiom 3 (Isometric Isomorphism of line and R). For each line l, there exists a bijection 
x called coordinatization from the set l to the set R, such that if A and B are any points 
on l, then 

\AB\ = |x(A) ~x(B)\ 

Definition 2.1.1 (ray). Given any line l, x a coordinatization of l and any point A 
on this line, then the subset r C {Z £ l\x(A) > x(Z) } is a ray with origin A. The set 
l n r c is also a ray (called opposite ray) with the same origin and will be denoted — f. 

Definition 2.1.2 (segment). Given any line l and two points A and B on /, then the 
segment AB is the subset of l defined by 

{Z G l\x(A) < x(Z) < x(B)Vx(B) < x(Z) < x(A)} 

and \AB\ is called its length. 

Axiom 4 (Half-planes). If l is any line, it defines a partition {HPi,HP 2 } of the plane, 
where the “parts” are the corresponding half-plane, such that 

(i) ifP and Q are in the same “part”, the segment PQ contains no point of l 

(ii) ifP and Q are in opposite “parts”, the segment PQ contains a single point of l 
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Axiom 5 (Measures of angles). It exists an mapping rad any angle Zhk (where h and k 
have the same origin) to the interval [0, n] called the (radian) measure of the angle, such 
that 

(i) h = fc => rad(Zhk) = 0 and h = —k => rad(Zhk) = n 

(ii) the stun of an angle and its supplement is n 

(iii) if j is in the interior of an angle Zhk, then rad( Zhj) + rad( Zjk) - rad( Zhk) 

(iv) if a ray kfrom a point Z lies in a line l, then for each half-plane defined by l the measure 
rad is a bijection from the rays j with origin Z lying in the particular half-plane to 
the set of real numbers a in ]0, rc[ 

Definition 2.1.3 (triangle). A triangle is a set of three (non-collinear) points A, B, C 
denoted by A ABC and the segments AB,AC and BC are called its sides. 

Definition 2.1.4 (congruence of segments and angles). If two segments have the 
same length or two angle the same measure, they are called congruent. 

Axiom 6 (congruence of triangles / side-angle-side criterion). Given two triangles. 
If two sides and the included angle of the first triangle are congruent to two sides and the 
included angle of the second one, then these triangles are congruent. 

Axiom 7 (Euclidean parallel axiom). Given any line and given any point not on it, there 
is only one line through the given point that never intersects the given line. 

Even though this "construction" above is for the 2-dimensional space, it can be 
easily adjusted to match for higher dimensional spaces. An example how to do this 
will be given when defining the axioms of the 3-dimensional hyperbolic space in 
section 2.3. 


2.1.2 Models 

The standard (analytic) model of the Euclidean plane/space is well known since al¬ 
gebra lessons in school. A point is represented as an element of R” i.e. by a ordered 
set / tuple (x{)i where the x, E R are called its Cartesian Coordinates. We will show 
below, that, indeed, lines are defined e.g. in two dimensions by either {x\x\ = a} 
if they are vertical or {x\x 2 = mx\ + b } otherwise. Furthermore a distance function 
i.e. d(x,y) = (x — y) T ■ {x — y) is defined by the canonical metric. 
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d(x = ( Xi)i,y = (y f )i) 
ds 2 
^ G 




i 


n 


where ds 2 is the line element and G its correspond¬ 
ing metric tensor. 


FIGURE 3: Cartesian coordinate system 
in three dimensions 

Another common model is the polar or (hyper-)spherical model where each point in 
the space is identified by its distance from the origin and certain angles. 



n —1 

ds 2 = dr 2 + r 2 (Y2 sin' ! ~ !+1 (0 ( )d© 2 }2.1) 

i=l 

/ 1 0 0 

Or 2 0 

0 0 r 2 sin ©2 


= 


\ 


V 


( 2 . 2 ) 




r is hereby the distance from the origin whereas 
©d) ( = ©) i s the azimuth and 0^ (= O),©* 3 ),... 
are the angles between the 3rd, 4th,... - axis. 


Like fig.4 suggests, the transformation from 
spherical to cartesian coordinates is 


FIGURE 4: Spherical coordinate system 
in three dimensions [K 0 IO 8 ] 


Xi 

= rcos(©i) 


*2 

= rsin(©i) 

cos(© 2 ) 

X-n —1 

= rsin(©i) 

• • •sin(© iJ _ 2 )cos(0„_i) 

Xn 

= rcos(©i) 

• • •sin(0 n _ 2 )sin(0„_i) 


2.2 Spherical Geometry 


Spherical Geometry has a long history as spheres and ball naturally fascinates mankind 
since the ancient age due to their high degree of symmetry (which is often seen as a 
reflection of divine perfection). Besides the pure fascination for this bodies, the most 
important motivation have been the many applications as in geography, astronomy 
and navigation. 
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2.2.1 * Axioms of Spherical space 

Although Spherical Geometry is not so close to the Euclidean Geometry as the Hy¬ 
perbolic case, that we discuss further down in this chapter, most of the axioms still 
hold (at least locally) when they are slightly adapted i.e. 3 

Axiom 8 (Isometric Isomorphism of line and R/Z). For each line l, there exists a 
bijection x called coordinatization from the set l to the set R/Z, such that if A and B are 
any points on l, then 


\AB\ = | min(x(A) — x(B),x(B) — x(A))| 


Furthermore, we have to modify the definition of ray and have to deal with the fact 
that we loose the uniqueness of line when encountering antipodal points (cf.[Har02, 
Ch.7]). The axiom that obviously fails totally is the parallel postulate. Instead of it 
we have to use the following: 

Axiom 7 (Spherical parallel axiom). Given a line and and a point not on it, no lines 
exists through that point that never intersects the given line. 4 

2.2.2 Models 

The most known model of the two-dimensional spherical space is the geographic 
coordinate system, that is used to map the earth. It is quite the same as the Euclidean 
spherical coordinates model with a fixed r. The only difference is, that the polar 
angle <p is replaced by the latitude </>Lat. = f — <p whereas the azimuth stays the 
same and is called longitude A. The metric is thus given by 


ds 2 = dfl at + (cos(4> L at.)) 2 dA 2 



The distance of two points A and B can furthermore be directly computed by us¬ 
ing 

d(A, B) = arccos(cos(</>L at .) cos(<^ at ) cos(A a - A B ) + sin(^ at ) sin(<^ at ) 

2.3 Hyperbolic Geometry 

The Hyperbolic Geometry has many different roots (cf.[CFK + 97]). Many of them re¬ 
sult from efforts to derive the parallel postulate (Axiom 7). Many mathematicians 


3 The reader should, nevertheless, not to be mislead to attach too much value to the similarities be¬ 
tween the Spherical and the other geometries. They just enable us to handle the structures on these 
geometries in analogous ways. In principal, the geometrical attributes of the surface of a sphere 
(unlike those of the Euclidean and Hyperbolic plane) are not consistent with all the properties of 
a common Hilbert space. 

4 In other words, two lines always intersect. 
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since the ancient times like Proclus (ca. 400 AD) or Kastner and Kliigel in the 18th 
century tried to prove that it would be possible to deduce this postulate by using 
the other ones 5 . Due to the many failed attempts, the approach finally changed in 
the 19th century when mathematicians like Schweikart and Taurinius (1794-1816) 
or Gauss, Lobachevski and Bolyai no longer tried to find a proof but focused on 
the consequences of denying this axiom. This resulted in a first grasp of the non- 
Euclidean geometry but still without an analytic understanding or even model. The 
fundament for this was then laid by Euler's studies (among others) of curved sur¬ 
faces and resulted finally in the theory of Riemannian manifolds in the middle of 
the 19th century and the subject of Differential Geometry. We will also make use of 
this work later when we will deduce certain structures in the hyperbolic space as 
it is a simply connected Riemannian manifold with a constant negative (sectional) 
curvature. 

Similar to the spherical geometry, the parallel postulate no longer holds here. In 
the two-dimensional case of the hyperbolic plane it is thus replaced by 

Axiom 7 (Hyperbolic parallel axiom). There exists a line and a point not on it, such that 
there are at least 6 two lines through that point that never intersects the given line. 

Beside this, all the other axioms and definitions of the Euclidean case are still valid 
here without any further adaption. 

Without getting too much into details it is still important to remark that the Hyper¬ 
bolic (as well as the Euclidean and Spherical) axioms are consistent and categorical. 
The consistency follows directly from the existence of the models shown below. The 
proof of the categoricalness is more subtle and will be skipped here. The interested 
is therefore referred to e.g. [RR95, chap. 7.7]. An important consequence of these 
two properties is, that we can freely choose an arbitrary model to prove anything 
we want and the results will still hold in any other model. 


2.3.1 * Axioms of Hyperbolic space 

Before concerning ourselves with concrete models we will have to discuss how to 
extend the hyperbolic plane given by the axioms above to spaces of higher dimen¬ 
sions. This is necessary as we want to use non-Euclidean spaces as feature spaces. 
Thus we will take another look on the axioms and try to adapt them in the follow¬ 
ing for at least the three-dimensional hyperbolic space. 

The Axioms 1, 2, 3 and 6 are exactly the same in the three dimensional case, so 
they will be skipped here. Instead, there will be three additional axioms: 

Axiom 4 (Half-planes). Give a plane, then any line in this plane separates it into two 
half-planes as in Axiom 4 in the planar case. 

Axiom 5 (Measures of angles). It exists an angle measure rad such that any angle that 
lie in a given plane satisfy the Axiom 5 in the two dimensional case. 

5 They actually didn't use the axioms listed above but based their attempts on older equivalents like 
the fives axioms of Euclid. 

6 in effect there are infinitely many for any line and any point not on it 
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Axiom 7 (Hyperbolic parallel axiom). There exists a plane, a line and a point in that 
plane such that the point is not on the line and that at least two other lines in the plane can 
be found that pass the point but don't intersect the first given line. 

Axiom 8. Given two distinct points in a plane, the line defied by them (cf.Axiom 1) lies 
entirely in the plane. 

Axiom 9. Any three noncollinear points lie in a unique plane. 

Axiom 10 (Half-space). Given any plane P there are two so-called half-spaces HS\ and 
HS 2 bordered by P. That are subsets of H 3 such that both and P are a partition of the 
space and thus pair-wise disjoint and if 

1) if points A and B are in the same half-space, the line segment between them does not 
intersect P 

2) if points A and B are in opposite half-spaces, the the line segment between them 
intersect P (in a single point) 

Corollary 2.3.1. Obviously, the axioms of the two-dimensional case hold in any two- 
dimensional plane in the three-dimensional (or even higher-dimensional) space. 


2.3.2 Models 


There are several analytical models of the Hyperbolic n-space that are commonly 
used. We will present here the two of them that we will use in this thesis. We will 
also restrict the discussion to the most basic attributes i.e the metric and the rep¬ 
resentation of points. All other concrete structures can then be deduced using the 
axioms above. We will do this in an extra section following below. 


Polar resp. (hyper-)spherical model The polar or, in more dimensions, (hyper-)spherical 
coordinates are defined in the same way they are defined in the Euclidean case 
as shown in fig. 4 for three dimensions. Since we use only two and three di¬ 
mensional spaces, we will restrict the following description to these two cases. 

The metric in 2-D (i.e. polar coordinates) is then given by 

ds 2 = dr 2 + (sinh(r)) 2 d© 2 (2.3) 

G = f 1 ° ) 

y 0 sinh(r) J 

In three dimensions this is generalized to 

ds 2 = dr 2 + (sinh(r)) 2 (sin 2 (pd© 2 + d<5> 2 ) (2.4) 

/ 1 0 ° \ 

G = I 0 sinh(r) 2 sin 2 <^ 0 J (2.5) 

\ 0 0 sinh(r) 2 / 


The spherical model may not be as instructive as the two Poincare models 
following below concerning e.g. geodesic lines and visualization, but it sim¬ 
plifies some calculations because e.g. regarding a point in the origin, r pro¬ 
vides directly the distance form this point to any other point without having 
to compute any "nasty" hyperbolic trigonometric functions or complicated 
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transformations. So when facing certain problems in the hyperbolic space, we 
will often retreat from the given model to this one and then convert the result 
back into the original model. 

Poincare disk/sphere model The (n-dimensional) Poincare disk/sphere model, of¬ 
ten denoted by ID", is an (non-isometric) embedding of the hyperbolic space 
into the 1R" where geodesic lines are either Euclidean straight lines passing 
the origin or Euclidean circles which intersect the circle at infinity perpendicu¬ 
larly as shown in fig.5. 

Historically, there are several concrete ways to obtain this models starting 
at other models. Given the Poincare Half-Space model the Poincare Sphere 
model can be constructed by "rolling up" the bottom line at infinity whereas 
given the hyperboloid model the Poincare sphere can be obtained by stereo¬ 
graphic projection. A more detailed view on this subject can be found e.g. in 
[CFK+97]). 

The model furthermore is very similar to the spherical model and thus both 





FIGURE 5: 2-dim. Poincare disk (grayed-out) with some (colored) geodesic lines 

can be easily converted into each other by transforming the spherical coor¬ 
dinates (rpD, 0^D,02,...) of the hyperbolic space into the spherical coordi¬ 
nates (rpD, 0^D,02,...) of the Euclidean space, in which the model is em¬ 
bedded, and then converting these spherical coordinates into cartesian ones 
(i'i, * 2 , ■ ■ ■ )• We get similar to the transformation between cartesian and spher¬ 
ical coordinates in the Euclidean space (cf. section 2.1.2) an isomorphism from 
spherical coordinates to Poincare disk: 


Ml 




( 2 . 6 ) 
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The metric in the n-dimensional case is defined by: 


G 

ds 2 


2 1 

(i -Liu 2 ) n 

4 Ljdu 2 

(1-Il«l| 2 ) 2 


(2.7) 

( 2 . 8 ) 


A nice feature of the Poincare model is that the distance between two arbitrary 
points P and Q can be easily explicitly computed using the following formula: 


d(P,Q) 


acosh(l + 2 


II P~> 


(i-m(i-nr) 


) 


where p and q are the representations of P and Q. 


(2.9) 


Remark: Henceforth, we will work in the Hyperbolic hyperspherical model unless 
otherwise noted. 
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2.4 Various Formulas 

In this section we will examine some important structures and deduce some formu¬ 
las of the Euclidean and Hyperbolic space that we will need in the next chapters. 


2.4.1 ^Isometries 

One important class of transformations in the metric spaces we defined above are 
isometries i.e. distance-preserving transformations. 


^Isometries in Euclidean space 

In the Euclidean case the isometries are elements of the common Euclidean group 
with subgroups like translations and rotations (about an axis or, in general, sim¬ 
plex). Like in the following hyperbolic case we won't discuss the group in detail but 
we will focus instead on the subgroup that we will need in the course of this thesis. 
In the n-dimensional Euclidean space this will be the subgroups already mentioned 
above combined with reflections in (n-1) hyper-planes. Their corresponding repre¬ 
sentations in e.g. the cartesian coordinates model are thereby 

translations: Tj(p) = p + a where a is the vector of translation 

rotations about the origin: Rd(P) = D ■ p with D S SO(n ) 

(main) reflections: Mj(p) =~p where pj = pj if j -f i and — p,- otherwise. 

Instead of general reflections we defined above the so-called main reflexions i.e. the 
reflexions that invert the orientation of exactly one given axis in the space. They 
do not form a group by themselves, but combined with the rotations and transla¬ 
tions they generate the group of the reflexions about any subspace. To be more ex¬ 
act, any reflection in an arbitrary (n-1) hyper-planes can be obtained by coordinate 
transformations via translations, rotations and main reflections. Analogously we 
can decompose each rotation into main rotations that are the rotations in the planes 
spanned by two arbitrary axes. In the cartesian coordinates model the (z’,/’)-th main 
rotation is then defined by (cf. [APA04]) 

mainrotation ; -y(p, #) = D ■ p (2.10) 

such that the (z,/)-submatrix of D is element the two-dimensional rotation with ro¬ 
tation angle & and the rest of D is just the identity matrix. 

The whole (sub)group of isometries that we will use is then obtained by combining 
all these subgroups/generators. 
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^Isometries in Hyperbolic space 

For the hyperbolic space we will start by describing the complete isometry group of 
the two-dimensional Poincare Disk model D. As shown in [And05] it is isomorphic 
to a subgroup of the Moebius group i.e. the projective special linear group PGL(2, R). It 
can be shown that in the given model, the set of these transformations is defined by 


Isom(D) = | \ a ' V G C 'll fl ll = l/||pll < bg G {Id,Conjugation} | 

( 2 . 11 ) 

In the case that p = 0 we get a pure rotation (together with a reflection if g is the 
conjugation) whereas for a = 1 we get pure translations. This simplifies our work 
with these isometries when we need to construct certain isometric transformations. 
While we won't show that these are in fact already all isometries we will at least 
briefly prove the following 

Theorem 2.4.1. All elements of the set Isom(D) as defined above are indeed isometries 
and this set forms a group using the complex multiplication as the group operation and is 
therefore closed under this operation. 


Proof. To show that the distances are preserved we make use of Eq.2.9. Let T be 
an element of Isom( ID) and v and w two arbitrary points in D. Since the acosh is 
strictly monotonic increasing it is now sufficient to check that the fraction in Eq.2.9 
is invariant under substitution of v and w by its transformations. Furthermore, it 
suffices to show this only for pure translations since a rotation about the origin and 
reflection in the real axis clearly preserve the norms and therefore the distance: 


| T a ,p,g(v) Ta iP 'g(zv) || — 


ag{v) + p ag(iv) + p 


pag(v) + 1 pag(zv) + 1 


ll p ~ w \\ 2 (1 ~ llp || 2 ) 2 

||pv + 1|| 2 ||pw + 1|| 2 


A(1 ||^«,p,g( z; )|| )(1 || T a ,p,g(zv) || ) — (1 

= _ (1-llPll 2 ) 2 _ 

II pv + 1|| 2 \\pZV + 1|| 2 (1 — vv){l — ZVZv) 

|| Ta,p,g(v) — Ta,p,g(zv) || 


ag(v) + p 


pag(v) +1 


\v — zv\ 


)( 1 - 


ag{v) 


pag(v) +1 


(i - ||r fl ^(u)|| 2 )(i - IIt^hH 2 ) (1 - ||u|| 2 )(l - ||w|| 2 ) 

d{T{v),T(zv)) = d{v,zv) 


To prove that Isom(D) is a group we have to check the axioms of the group. 

• Existence of neutral element: Id Isom ( D ) = T 1/ o / id 

• Existence of inverse element: Let T a ,-p,g be an arbitrary element of Isom(D) 
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and z and arbitrary point in D. Then we get 



- py) 


y-v 


T -i (v s = g(«)g(y) + g(-p«) 
g(-pa)g(a)g(y) + l 


This is obviously an element of Isom(D) for any T. 

• Closure: Let T = T arP/g and f = Ta,p lg be two arbitrary elements of Isom(D) 




a g( s g(z)) + ag(p) + agfyagjz )) + a 
yag(ag(z)) + pag(p) + gfyag(z)) + 1 

A . a(n) 4- P 



where A,P and G are the parameters of the resulting transformation. Ta,p,g is 
again an element of Isom(D) as ||A|| = 1 and G = Id if g = g and g = Conj. 
otherwise. 

• Associativity: With the formula for the result of an group operation obtained 
above, the associativity of Isom (ID) follows directly from the associativity of 
the complex numbers. 

Thus, besides calculating some formulas about these isometries that we will need 
later on, we have shown the claim that Isom(D) is indeed a group of isometries in 
the Poincare Disk model. □ 

For the higher dimensional case we will just note that the group generated by the 
Euclidean main rotations is a subgroup of the isometry group of the Poincare n- 
Hypersphere model. This follows since this model is isotropic i.e. any structure 
won't change if we rotate the whole space about the origin. This thereby is true as 
we deduced (or will deduce) any structure from the given metric and the axioms 
which are obviously all invariant under this kind of transformation. 

2.4.2 Geodesic lines 

Until now we have avoided to discuss what exactly lines are. Taking a close look at 
Axiom 2 and 3, we can deduce that, taken two arbitrary points A and B, the seg¬ 
ment of the line between these points has to be a (local) distance minimizing curve 
i.e. a so-called geodesic or geodesic lines. We will now show how we can determine 
in general the set of points forming a line in any of the two-dimensional analytic 
models above. The higher-dimensional lines can then be derived analogously. 


Let A and B two arbitrary points, Co the line segment between them and a = x(A) 
and b = x(B) the coordinates of these points in respect the coordinatization x of 
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the line. Furthermore let u and v be coordinates and G(u,v ) the metric tensor of the 
model. We can therefore describe the line in terms of the coordinates of the model 
as follows: 

x e~R,u a = u(o),va = v(a),u B = u{b),v B = v(b) 

with u, v continuous and distance minimizing 

By Axiom 3 we then get: 


AB = \ (u(x),v(x))\ 


b — a = 


|AB|=s = j ds = J \Jgn(u,v)du 2 + 2gi 2 (u,v)dudv + g 2 i{u,v)dv 2 
j \Jgn{u(x),v(x))u 2 +lg 12 (u(x),v(x))uv + g 22 (u(x),v(x))v 2 dx 
I \J<F>(u(x),v(x),ii(x),v(x))dx 


where s is the arclength of Co, ds 2 is the line element belonging to the given metric 
and u and v denotes the derivative of the coordinates in respect to the parameter x. 

Our goal is now to determine the functions, or to be more exact, functionals u(x) 

and v(x) that belong to the line i.e. minimize the arclength. This is obviously an 
extremum problem. A common method to solve it is to use calculus of variation. 
Thus, let C a small variation of Co from A to B determined by function(al)s 

u(x) = u(x)+ef(x) (2.12) 

v(x) = u(x)+eg(x) (2.13) 


where / and g are arbitrary smooth functions vanishing for a and b and £ is the 
parameter of the variation. Thus we obtain for the length of C: 

length(C) = length(e) = J \J<&(u(x),v(x),Q(x),‘d(x))dx (2.14) 

Since the length has to have a minimum at £ = 0 for any choice of / and g, it is 
required that the derivation in respect to £ vanishes there 


dlength(C) 


£=0 = 


dlengthfe) 


de 
2 ' de 


= ^ f n \J < ^{u(x),v(x),d(x),'d(x))dx 


£=0 


r b 1 ^ 90 , , > 

-v O—e^o dx = 0 











26 


Chapter 2 Certain Aspects ofRiemannian manifolds 


Eqn.2.12 and 2.13 and Integration by parts yield 


b 1 . /^ 9 °( w + £ f> v + eg, ^(u + ef), j- x {v + eg)) 


Vd 

la ^ 


de 


= ndx = 0 


e=0 


<3> 


< 3 > 


r— ( 30 df 30 3? \ , 

^ ( M 3^ + 33 fh? ^ |t:=0 * = 




u*+ 

,[ ^ (^ /w + | ->* 

- f^ ((4lf) /w+ {§-M) *«) ^=° 

f ^ -1^) /<")*!-»+f ^ ?W^I-o= 0 


As this equation has to hold for any / and g, it follows that along Co the following 
two differential equations called equations of geodesics have to be fulfilled 


30 3 30 _ 

3 u dx dii 


30 3 30 _ 

dv dx 33 


(a < x < b) 


(2.15) 


We will use this result to apply it on two models. 


Geodesics in Euclidean space 

In the Euclidean n-space we will determine the geodesics in the cartesian coordi¬ 
nates model. According to the metric shown in section 2.1.2 O is given by 

n 

0(^1, ...,x nr Xx,...,x n ) = Y^x} (2.16) 

i =1 

Thus we obtain n equations of geodesics: 


30 3 30 

3 X{ dx d±i 





i E n} 


Hence the geodesics are the known straight lines. 


Geodesics in Hyperbolic space 

As seen in section 2.1.2 the line element in the two-dimensional hyperbolic spherical 
coordinates model is 

ds 2 = dr 2 + sinh 2 (r)d0 2 => O (r,6,r, 9) = r 2 + sinh 2 (r)0 2 
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By using this in Eq.2.15 we get 


9 , 

2sinh(r) cosh(r) — 2f = 0 A —s —(sinh 2 (r)0) = 0 

By integrating the right equation and substituting the result in the left one, we ob¬ 
tain: 


sinh 2 (r)0 = A o 9 = 


Ci 


sinh 2 (r) 


7 cosh(r) v . 9 . . C?cosh(r) 

c?- 5 ^- = (r) => (2r) —- 

sinh 3 (r) 9* sinh 3 (r) 


= (2 r)r 2 


Cl 


sinh 2 (r) 


+ r 2 = C 2 


(2.17) 


(2.18) 


where Ci and C 2 are constants. Ci is hereby depending on the two points uniquely 
defining the geodesic curve whereas C 2 is depending on the parameterization of 
the line. We will now focus on the case that C\ = 0. Then Eq.2.17 and 2.18 yield 


Q = 0 


r = C 2 A 0 = 0 
r = C\ ■ x A 9 = const. 


These are curves that pass through the origin. Transferring them to the Poincare 
Disk model would result in Euclidean lines in the embedding complex plane. 
Considering the case of Ci ^ 0 would give us the other geodesics (which are not 
passing the origin and are represented by the segments of certain (Euclidean) cir¬ 
cles in the Poincare model). But instead of calculating them we can use isometric 
transformations of the particular model namely rotations around the origin and 
translations to convert any possible case in the hyperbolic n-space to the one case 
we already solved above. Indeed, according to Cor.2.3.1 we can handle any problem 
restricted to an arbitrary plane in the n-space as a problem in the two-dimensional 
hyperbolic space. In the preceding section we already discussed the isometries, 
namely the group generated by transformations, rotations and reflections in the 
two-dimensional Poincare disk model and already briefly mentioned the part main 
rotations are playing in the higher-dimensional spaces. 

Given two arbitrary points in the n-dimensional Poincare Sphere model. We use 
first an modification of the Aguilera-Perez Algorithm presented in [APA04] to con¬ 
secutively rotate both points into the x n _i~x n plane of the embedding IR"-space as 
follows (where a and b are the representations of the points A and B): 

let T = Identity 
for i := 1 to n-1 

M := MainRot at ion, + j ,• ( atan 2 , (ij ) ) 

a := M(fl) 

T = M o T 

b := Kb) 

for i := 1 to n-2 
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M : = MainRotat i onj + y ( at an 2 , £>,) ) 

b := M(fr) 

T = M o T 

Listing 1: modified version of Aguilera-Perez Alg. to rotate both a and b into x n _i~x ,,-plane 

In the x n ~i-x„ plane we can now use the already defined two-dimensional transla¬ 
tion to move one of the points to the origin. Thus, we are in a case where we already 
know how the geodesic look like. To get the points on the geodesic back in the n- 
space, we just have to apply the reversal transformations (reverse translation and 
reverse of rotation M in the algorithm) on the points of the known geodesic in the 
x n -i-x n plane. 

By doing so we get for example in the two- and three-dimensional Poincare sphere 
model the results that are shown in fig.6 




FIGURE 6: Geodesics in (left) 2-dim (right) 3-dim Euclidean (blue) and Hyperbolic space (purple) 


Geodesics in Spherical space 

Although we won't prove it here, it should be briefly noted that the geodesics 
in spherical space are the so called Great Circles. These are the circles around the 
center if we embed the two-dimensional spherical plane naturally in the three- 
dimensional case (i.e. sphere). 


2.4.3 ^Trigonometric laws 

Since the Euclidean and Hyperbolic geometries are quite similar, the well-known 
trigonometric laws of the Euclidean space have their equivalences in the hyperbolic 
space. Given a triangle with sides a,b and c and the corresponding angles a, jf> and 
7 , these are (proof can be found in e.g. [And05, ch.5]): 

Law of sines: 

sinh(fl) sinh(b) sinh(c) 
sin(a) sin(/3) sin( 7 ) 


(2.19) 
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Law of cosines I: 

cosh(c) = cosh(fl) cosh(b) — sinh(a) sinh(&) cos( 7 ) (2.20) 

Law of cosines II: 

cos(j) = — cos(a) cos(jS) + sin(a) sin(/3) cosh(c) (2.21) 

As a property of the hyperbolic space mentioned above, in the limit of infinitesimal 
small triangles we will get the Euclidean version back. 

2.4.4 ^Equidistant curves 

Concluding this chapter, we will determine the formulas for curves and curved 
planes that are equidistant to a straight line or flat plane. Assuming wlog. that the 
representation of the reference plane in the particular models is chosen in such a 
manner that the coordinates except one are invariant under a projection onto a (n- 
l)dim subspace of the same geometry, i.e. a plane containing (n-1) axis. This is al¬ 
ways possible, because we can transform every flat plane into one that fulfills this 
condition by using isometries. 

We will start again with the Euclidean space. In two dimensions, a equidistant curve 
to a straight line / equals a parallel line of /. Thus it is again a (straight) line. 

The generalization to higher dimensions can be defined recursively. A n-dimensional 
equidistant plane in the distance d to the plane P = {(*,),|*i ... ,x n -\ G 1R, = 0} 

is then determined by {(xi)i\xi... ,x n -\ G R,x n = d} 

In hyperbolic space equidistants are no longer straight lines and flat planes, but 
can be again easily derived from the trigonometric laws. 



FIGURE 7: Equidistant curve in 2-dim hyperbolic space 


Given a right triangle as shown in fig. 7 with on vertex at the origin,^ as the distant 
and one side be part of one of the lines, using the law of sines leads to the following 
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formula defining the equidistant curve. 


sinh(r) 


sinh(d) 

sin(0) 


( 2 . 22 ) 


Using (p = n — 0 instead, we just get here a relation between the radius and the 
polar angle of the hyperbolic spherical coordinates model that a equidistant has to 
fulfill. Converting this result e.g. to the two- and three-dimensional Poincare Sphere 
model we finally get the curves and surfaces that are shown in fig.8. 
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boundary at infinity- 

equidistant surface d=0 — 

equidistant surface d=l - 

equidistant surface d =2 - 


FIGURE 8: Equidistant curves and surfaces in (left) 2-dim (right) 3-dim Hyperbolic Poincare Disk/- 
Sphere model 


























Chapter 3 

Generating Maps 


As discussed in ch. 1, the neurons form a special kind of structure in the map space. 
In our case, this structure is chosen in such a way that the characteristics of the 
space, in which it is embedded, are best represented. A common way to satisfy this 
criterion is to partition the the space into a set of regularly bounded regions. These 
form the so called regular tiling or tessellation. The representatives of the neurons in 
the map space are then placed at the positions of the centroids Vj of these tiles given 
by 


= klei^ dV 
' Ail ei*V 

where dV is the volume element of the particular map space. Thus in return taking 
this set of nodes, the corresponding Dirichlet or Voronoi tessellation defined by 

Tile i = {u| ||u — U;|| < ||u — Wj || V/} 

which equals the original tiling. The regularity of the tiling ensures a high degree of 
symmetry like certain rotation and translation invariances that reflect the symme¬ 
tries of the map space itself. 

The representatives that are placed in such a manner and the edges between them 
form itself the dual tiling. As it is easier to create the dual tiling than to calculate the 
centroids of the original tiling, we will use the latter method. To avoid thereby un¬ 
necessary confusion about the nomenclature of the maps (i.e. if a triangular map is 
defined by a triangular Voronoi tessellation of its nodes or by a triangular dual tiling 
that generates the nodes) we will name henceforth the resulting maps by referring 
to the shape of the Voronoi cells. Thus, using our example above, a triangular Eu¬ 
clidean map corresponds to the set of vertices of a partition of hexagonal tiles (and 
vice versa). Furthermore, a map, defined by Schlaefli symbols (cf.[Cox73]), will al¬ 
ways refer to the respective Voronoi tiling of the map described by these Schlaefli 
symbols. 

In the following we will now have a look at the Euclidean and Hyperbolic plane 
and discuss the possible regular tilings. Then we introduce a method to use these 
tilings to generate the map. 
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3.1 Tilings of the Euclidean plane 


To determine the possible regular tilings for the plane i.e. the tiling by regular poly¬ 
gons, we will examine their triangulation, which is a tiling by triangles. Fig. 9 shows 
such a triangulation of certain polygons. We will see below that these polygons are 
all bases for tessellations of the plane and that there exist no others. 




FIGURE 9: triangulation of triangle, square & hexagon 


Each polygon can be divided into triangles by connecting the center with each ver¬ 
tex. So, a n-gon consists of n such triangles. As it was assumed that the n-gons are 
regular, the triangles are all congruent to each other. The angle a belonging to the 
vertex in the center of the polygon is therefore 2n divided by the number of edges 
or vertices of the n-gon. The other two angles /l and 7 are equal. So we get two 
conditions which the interior angles at the vertices of the polygon have to satisfy. 
First, if there are q neighbors meeting at each vertex, these angles have to be 
Secondly, since two of the p triangles meet in each of the vertices, we conclude, that 
the interior angles are just twice as large as beta and gamma. Summarizing all this 
results in 


cx 




2tc 

V 


7T 



We now use the fact that triangles in Euclidean space have neither an angular defect 
nor an angular excess i.e. the sum of the three interior angles is always equal to n 
(in contrast to the spherical and hyperbolic case, in which it is strictly larger resp. 
smaller than tc). Thus we obtain: 


„ 2tc n n 

a + j6 + 7 = 7r=>-(-1— 

p q q 



1 

2 


This is fulfilled for only three choices of p, q 6 IN: 
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(triangular) p = 3,q = 6 —> tx. = 120°,/3 = 7 = 30° 

(square) p = 4,q = 4 —> a = = y = 90° 

(hexagonal) p = 6,q = 3 — t dl = 120°,^ = 7 = 30° 

So, all valid regular tilings of the Euclidean plane consist of sets of identical tiles 
which are congruent to one of these three defined above. Even if it seems quite 
obvious, it is important to mention, that the edge size of this tiles can be chosen at 
will. Hence arbitrary refinements of the tilings and therefore any density of neurons 
are possible. 





FIGURE 10: Regular tesselation of Euclidean plane to Schlaefli symbols (3,6), (4,4) and (6,3) 


3.1.1 Generating tilings 


Each regular tiling belongs to a certain symmetry group. Each member of this group 
thereby maps tilings onto each other. By using the whole group, it is already possi¬ 
ble to generate the whole tiling starting with only one given tile or vertex. Depend¬ 
ing on the tiling there are many ways to choose the symmetries. Following a paper 
of N. Kuiper [Kui88] we will use involutions at the center of the edges, i.e. reflec¬ 
tions perpendicular to the edge or halfturns. The needed transformations can be 
easily composed using the basic isometries of the Euclidean (and hyperbolic) space 
which are rotations R(<p) around origin by angle cp, translations T(z) (moving 0 to 
z) and reflections M along an axis. Given a tiling of p-gons with q neighbors at each 
vertex, the (finite) symmetry group is consisting of 


,2n-i 


R _1 (-)oT _1 ( 


edge length, 


q 


0 M 0 T( 


edge lengthy 2 7T 


i £ {0... p — 1} 


Then every vertex can be created by letting a composition of elements of the group 
acting on a distinguished point 0 called origin, i.e. for every vertex v exists a finite 
sequence (g h ,... ,g in ), g, k G G such that 


(3.1) 

Using the involution in a side center as proposed, every gi k "transports" the point at 
the one end of the side to the other. Fig.11 illustrates this process using each member 
of the symmetry group of the (6,3)-tiling once. 
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Figure 11 


Thus, when applying each of the generator of the group on a certain vertex, we ob¬ 
tain all the neighbors of it, i.e. all vertices connected to the former one by an edge. 
This will be called expansion of the parent vertex creating as many children as there 
are generators. 

It is note-worthy, that the given definition of the generating symmetry group do 
not use p as a parameter i.e. the tiles are already defined by q and the length of 
their edges. In the Euclidean plane p depends on q and vice versa. In the hyperbolic 
case the edge length is also needed. We will see this when discussing hyperbolic 
tilings below. But beside this fact everything in this paragraph so long holds for 
both geometries. 


3.1.2 Periodic boundary conditions 

A useful "feature" of the Euclidean space is the possibility to define periodic bound¬ 
ary conditions quite easily. Boundary conditions are an important method in numer¬ 
ical computations. Since every simulation has a limited computational time, it is 
only possible to simulate finite grids. As the neighborhood of points near the edge 
of these finite grids now differs significantly from the neighborhood of the vertices 
in the center, every truncation of the map results in boundary effects. A solution to 
avoid this is to wrap the space, so that each edge is "glued" to the opposite one, i.e. 
if f(z) describes any property of the point z then f(z) = f(z + f) where \r, \ is the 
size of the grid in the ^-direction. In our case, this means, that distances and adap¬ 
tions are always calculated in respect to the nearest representative of this class of 
points. More precisely, in two dimensions we then no longer work in the Euclidean 
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plane but in the quotient space \\i"/Hr = (1R/Z) 2 called flat toms which has non- 
theless a zero curvature. 

Fig. 12 shows examples of finite square boundaries for each of the three tilings 
found above. Due to different symmetries of the square and hexagonal tiling the 
grid size is restricted to even numbers of neurons along each dimension. 



3 


"J 1x1 = INode 
ill' 2x2 = 4 Nodes 
"J 3x3 = 9 Nodes 
4x4 = 16 Nodes 


FIGURE 12: Periodic boundaries for tesselation of Schlaefli symbols (3,6), (4,4) and (6,3) 


3.2 Tilings of the Hyperbolic plane 


The same method, that we used to find an appropriate tiling for the Euclidean 
plane, can be adapted for the Hyperbolic case. It differs only in the fact that we 
have now angular defects to take into account. This softens the restrictions to the 
interior angles, since the sum of the angles of each triangle (of finite size) is now 
strictly smaller than n, which allows infinitely many choices for p and q. Examples 

7 

are 


(triangular) p = 3,q = n n>7=>g + ^<2 
(square) p = A,q = n n>5=>\ + } i < \ 

(hexagonal) p = 6 ,q = n n>4=>g + ^<| 

While therefore the choice of the shape of the tiles and their neighborhood is less 
strict than in the Euclidean plane, the contrary is true for the third parameter, namely 

7 In fact, we will use maps that correspond to these tilings in the numerical analysis. 



























































36 


Chapter 3 Generating Maps 


the length of the edges. According to the theorem of Gauss-Bonnet (cf.[And05],[RR95]), 
the area of a triangle and hence for every polygon is proportional to the angular de¬ 
fect. Since the choice of p and i] determines the defect A by 




> 0 


the length of the edges is also already uniquely determined by them. To deduce 
now the formula for this edge length, we will have to remind the hyperbolic law of 
cosines II (cf.chapter 2.4.3): 


cos( 7 ) = — cos(ai) cos(j 8 ) + sin(a) sin(/3) cosh(c) 


While using the same triangulation of a regular polygon as seen above and regard¬ 
ing one of these resulting triangles, let a, f> and 7 be chosen as above and let c be 
the length of the edge opposite to gamma. Then c is equal to the edge length of the 
polygon. Therefore, we obtain: 


cos(^) +cos 2 (f) 

c = acosh(- P . 2 -—) 

sm z (|) 


Since we need only the half of the length for defining the symmetry group which 
generates the tiling (see 3.1.1) this can be further simplified as follows 


(c/ 2 ) = 


1 /cos(f) + cos 2 (f)' 

-acosh --—-— 

2 V sin if) 


hcosh ( 2 ^ 

2 V ^ 2 (f) 


1 u / cos (y)+ 1 “ sin 2 (f)' 

2 { sin 2 (f) 


— 1 = acosh( 


cos(J). 

sin(f)’ 


Thus, the regular tilings like the three examples shown in fig.13 are completely 
defined. 



FIGURE 13: Regular tesselation of hyperbolic plane (in Poincare model) to Schlaefli symbols (3,7), (4,5) 
and (7,3) 


3.2.1 Generating tilings 

Using a symmetry group defined in the same way as in the Euclidean case, any of 
the infinitely many regular tessellations of the hyperbolic plane can be generated. 
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But, as we have seen right above, the regular Hyperbolic maps are already com¬ 
pletely defined by the Schlaefli number and does not allow a variation of the edge 
length. 

Another significant difference to the Euclidean case becomes obvious if we con¬ 
sider the number of nodes lying in a given circular neighborhood B r in the two- 
dimensional Euclidean and Hyperbolic plane or, to be more exact, its growth rate 
in respect to the radius r of the neighborhood. Since all the regular tiles have the 
same area, the number of nodes multiplied by the area of the faces in the dual tiling 
is approximately proportional to the area of B r . In the Euclidean case, this area is 
given by A EucL (r ) = nr 1 . The area of the Hyperbolic circle on the other side can be 
analogously determined by the following integration 8 9 : 

rr r-2n 

A Hyv '(r) = / / 0sinh(r)d0dr = 27r(cosh(r) — 1) 

Jo Jo 

Thus, this yields an exponential growth rate in respect to the radius for the Hyper¬ 
bolic case while the same rate is only polynomial in the Euclidean plane. Tab.l lists 
now the number of nodes for various Euclidean and Hyperbolic maps and different 
sizes of neighborhoods as an example. It can been seen that the number of nodes 
lying in a circular neighborhood indeed growths nearly by the same factor as the 
area of the neighborhood itself, as claimed above. 


radius 

area 

Euclidean s 
(4,4 ) 4 nn — 1 

Dace 

(3,6 ) 4 nn = 1 

H) 

area 

7 perbol 

(3/7) 

ic spac 
(4/5) 

e 

(6,4) 

V2 

6.28 

9 

7 

7.40 

8 

6 

5 

2 

12.57 

13 

19 

17.36 

15 

11 

5 

2y/2 

25.13 

25 

31 

47.05 

43 

41 

21 

4 

50.26 

49 

61 

165.30 

176 

101 

81 


Table 1: Number of nodes for Euclidean and Hyperbolic maps and different sizes of circular neigh¬ 
borhoods 


3.2.2 Boundaries/Truncation 

Analog to the Euclidean case, we are confronted with the fact that we can only 
use maps with finite size in numerical simulations, but unlike the Euclidean maps, 
periodic boundaries can not be defined here in such an easy manner. Furthermore, 
due to the exponential growth most of the nodes of the Hyperbolic map lie always 
near the edge of a finite map.Two sensible methods of truncating the infinite tiling 
9 to get a finite map are: 

- Truncation at a given maximal distance of the generated nodes to the origin o. 

- Truncation at a given maximal level of expansion i.e. maximal number of ex¬ 
pansions needed to create a node when starting with the origin o 


8 using Hyperbolic spherical coordinates 

9 not to be mistaken for the "truncation" of a tiling! 
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Given a circular area around o, the truncation at the maximal distance ensures the 
maximal number of nodes in this area. So, this method was for example used to 
determine the number of nodes in tab.l for the various hyperbolic maps. 

The second method of truncating at a maximal expansion level, however, results 
in a shape of the map that is, in general, less circular. Fig.14 illustrates this fact by 
comparing the distances of the nodes of different layers for the (3,7)- and the (6,4)- 
map. While the nodes of the lower layers are close together, the nodes of higher 
layers are rather scattered and the layers even overlap. Thus, by truncating at a 
certain level, many nodes that would lie at the same distances from the origin as 
the generates vertices would be neglected. 
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FIGURE 14: Position of nodes/layers in Poincare Disk and distance of nodes to the origin (left) (3,7)- 
map (right) (6,4)-map 

But, on the other hand side, this truncation method ensures, that the map is isotropic 
in respect to its hierarchical structure, and is used in the papers covering (non¬ 
growing) HSOMs (e.g. [Rit99],[ORGG01]). 













Chapter 4 

Sample distributions 


Normally, the samples in the feature space presented to the SOM have an distribu¬ 
tion that reflects the structure of the examined object or networks which is in gen¬ 
eral very complex. The goal here is to analyze the stability of certain configurations 
of SOMs concerning an additional dimension (i.e. input in feature space has one 
more intrinsic dimension than map space. In the following, we will name the di¬ 
mensions except the additional one map dimensions whereas the additional one will 
be called extra dimension) and depending on their type of map and feature space 
configuration, sample sets with particular constructed probability densities will be 
used. To adjust the influence of the extra dimension, the spread of the samples will 
be bounded in this direction and, to ensure the translation invariance of the sample 
distribution in the dimension except the extra one, two equidistant (curved) hy¬ 
perplanes are chosen as the boundaries. In the other dimensions there will also be 
boundaries to limit the sample set volume to keep the probability density normal¬ 
izable. As described in the section about tessellation, the use of periodic boundaries 
can then be used to create at least the illusion of an unbounded sample set in all di¬ 
mensions besides the extra one. Nevertheless should the extra dimension be much 
smaller than any other to suppress boundary effects. 

Since we have already collected all necessary geometrical "ingredients" in a former 
chapter about the geometry of the spaces, we will now concentrate on the defini¬ 
tion of these distributions for the different spaces and models fitting best to their 
inherent structure without imposing any other additional special structures. Thus 
appropriate distributions are uniform or can be obtained by transferring uniform 
structures of the map space onto the feature space. They share all the invariances 
concerning rotations and translations that are intrinsic to the grids in the respective 
map space. Hence the definition of the probability distribution that are needed here 
is quite trivial. They will be shortly introduced in the next paragraph. Much more 
work has to be done to construct and implement these. A detailed view on this will 
be done further down. 
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4.1 Distributions 


Generally speaking, a (continuous) uniform distribution is a class of probability 
distributions such that all subsets with equal volumes are equally probable i.e. each 
infinitesimal volume element dV should have the same probability Knowing the 
metric and thus also the volume element dV allows us to calculate directly the (joint 
or multidimensional) probability density function for each particular model of the 
space we want to use. At this point it should be noted, that the densities that will 
be presented below, still depend on a normalization constant N, that itself depends 
on the support of the functions, i.e. the volume of the set of possible samples. Since 
the densities will be used unnormalized, N will not have to be determined in every 
case. 

Before starting to go through the individual cases, a theorem we will make use of 
has to be mentioned (cf. [Dev86, Thm.4.2]). 

Theorem 4.1.1 (Transformation of (n-dimensional) random variables). Let X have a 
continuous density f on its support set S C R rf and given a transformation which is a 
C 1 -diffeomorphism h : S —> T C R d , i.e. continuous and bijective function such that, if 
g{y) := h~ l (y) : T —> S is the diverse of the transformation, its first partial derivatives 
and therefore its Jacobian matrix J = (^|);y exists and the derivatives are continuous on T. 
Then Y = h(X) has density 


/(s(y))det(J) yeT 

With this theorem and the transformations defined in ch.2 we can convert densities 
given in one model to the corresponding densities for another model. 


4.1.1 Euclidean-Euclidean case 

In the Euclidean space we will use the well-known cartesian coordinate system. 
The corresponding metric tensor is simply the identity. Thus the volume element is 
given by 

dV = dx i • dx 2 . dx n 

Let V be the volume of the compact subset s C IR", in which the samples shall be 
located. Then the joint probability density function of the cartesian coordinates is: 

Q {X 1 ,...,X n ) = l/V =:N 

where X \,..., X n are the random variates representing the cartesian coordinates. 
An illustration of this can be found in fig 15. 

Since the equidistant surfaces are planes in the Euclidean space, the shape of the 
compact subset S will simply be cuboidal (as seen in the figure). 
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FIGURE 15: Euclidean space: (left) illustration of distribution boundaries, densities and voronoi cell 
(right) realization of uniform distribution 


4.1.2 Hyperbolic-Hyperbolic case 

In hyperbolic space (as in any other) the probability density depends on the chosen 
model. We will focus here on the spherical coordinate and Poincare (hyper-)sphere 
model, because both will be used in the following. Again since both map and fea¬ 
ture space are again equal, a uniform distribution will be used. The way to define it 
is analog to the Euclidean case above. Only the volume element and therefore the 
density is more complicated. 

Hyperbolic Spherical Coordinates model Due to the metric defined in Eq. 2.3, the 
volume element is given by: 

dVnsph — (sinh r) 2 sin cpdrdOdcp 

and so we obtain as the joint density function: 

QHSph(R ■/<£/©) = N(sinhr) 2 sin</> 

where V is again the volume of the compact subset S C H 2 , from which the 
samples are drawn. 

Poincare (hyper-)sphere model There are now two ways to get the uniform den¬ 
sity for this model. We can again have a look at the volume element and de¬ 
duce the density by using it analogously to the cases above. But since we are 
going to implement the uniform distribution belonging to this model by just 
using the spherical equivalent, it is more suitable here to derive the density 
also by using the coordination transformation between the two models and 
applying theorem 4.1.1. 

We have already seen in section 2.3.2 that this transformation T H sph^PD is 
defined by Eq.2.6. Thus the reverse transformation for the desired 
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three-dimensional case is: 


r = 2atanh( \J x\ + x\ + x|) 

<p = atar\2(\Jx\ + x\, x^) 

9 = atan2(x2,xi) 


where atan2 is the usual two-argument variant of arctangent. Furthermore is 
the Jacobian matrix of Tsph^PD given by: 


JtiSph 


-s -PD — 


dT Sph ^.p D (r, (p,9 ) 
d(r,(p,9) 


( 1 tanh ( 2 ) s i n (^) cos (g) tanh(j) cos((J)) cos(0) — tanhjg) sin(<J>) sin(0) 
1 tanh (2) sin(fl) tanh(0 cos((p) sin(0) tanhj^) sin(<J>) cos(0) 


V 


1—tanh z ( j) / s 

- 0 2 COS (</>) 


— tanh(£) sin(<J>) 


The Jacobian determinant of the inverse transformation, that will be needed to 
apply the theorem can then simply be obtained by calculating the determinant 
of / above and then by using the fact that the determinant is a multiplicative 
map taking the reciprocal 10 . This yields: 


det (Jmph^PD) 
det (JpD-^HSph) 


, 2/ r x . A ~ tanh 2 ^) 
tanh 2 (-) sin cf> ■ (--- 


tanh 2 (atanh ( x\ + + x 2 )) sin (atan2 ( x\ + x\, x \)) 


(1 — tanh 2 (atanh {\j x\ + x\ + x 2 ))) 


R 2 sin(atan2(y // x^ + x|, x 2 ))(l — R 2 ) 


with R = y ■ x \ + x\ + x 2 = 11 x 11. Now we can finally transfer the spherical 
probability density Qnsph to the Poincare Sphere model: 

QPd(x\,Xi,xA) Thn id' 1 ' 1 QHSph{TpD->HSph{xi, *2, X3)) ■ det (JpD->HSph) 

,, r, tanh 2 (2atanh(R)) 

_ 2(sinh 2 (2atanh(R)) _ Z l-tanh 2 (2atanh(R)) 

R 2 • (1 - R 2 ) R 2 • (1 - R 2 ) 

o (2R/(1+R 2 )) 2 

= , 7 2 1-(2R/(1+r 2 )) 2 = 2 ■ 4R 2 / (1 — R 2 ) 2 _N_ 

R 2 • (1 - R 2 ) R 2 • (1 - R 2 ) (1 - R 2 ) 3 

=► QPD(x 1,X2,X 3 ) = - - ^~ 2 TZ 

(1- Ml ) 3 


Unlike in the Euclidean case the shape of the sample set as shown in the Fig. 16 is not 
so trivial, since the equidistant curves and surfaces (cf. section 2.4.4) are no longer 


10 We skip at this point to specify explicitly the Jacobian of the inverse transformation, but its existence 
follows directly by the non-vanishing determinant and thus the existence of the reciprocal. 
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lines and planes in the hyperbolic space. To avoid boundary effects, the "lateral" 
boundary is shaped in a way to preserve the voronoi cells of the outermost nodes, 
i.e. is defined by the geodesics at each point of the edge of the disk such that they 
are perpendicular to it. 



X 



r 


- 



FIGURE 16: Hyperbolic space / Poincare Sphere: (left) illustration of distribution boundaries, densi¬ 
ties and voronoi cells (right) realization of uniform distribution 


4.1.3 Hyperbolic-Euclidean case 

This is the first case where the map space is different than the feature space. It is not 
longer enough to just use a uniform distribution in the feature space, because this 
would result in the lost of the hierarchical structure that we have in the map space 
and therefore in the grid of neurons. To conserve this, we adapt the probability 
density by using the volume element of the map space. Only for the extra dimension 
a uniform Euclidean distribution is used. Thus by embedding the Poincare Disk 
model into our Euclidean feature space we get: 


dV = dVpjj ■ dspiid = (1 — IM| 2 ) ^dx\... dx n 

In other words, we present the SOM in the map dimensions a hierarchical structure 
and an uniform Euclidean scattering in the extra dimension as we would obtain by 
statistical fluctuations around a "hierarchical object". The shape is therefore as seen 
in fig. 17 a "fat" disk. 
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z 

4 


i> X 





FIGURE 17: Euclidean space: (left) illustration of distribution boundaries, densities and voronoi cells 
(right) realization of distribution 


4.2 ^Generating random variates 

Now we have to discuss how to generate the random variates defined by a partic¬ 
ular density function. Referring to Luc Devroye's "bible" for non-uniform number 
generation [Dev86] we will present two important principles. In order to do this 
as short as possible, we will skip the proofs of the thereby used theorems, but the 
interested reader may find them in the appendix A.4. We will further assume that 
the reader has knowledge of the basic vocabulary of probability theory. 


4.2.1 ^Inversion principle 

This method is based upon the property given by [Dev86, thm.II.2.1] 

Theorem 4.2.1. Let F be a continuous cumulative distribution function (CDF) on Kzvith 
Inverse F“ x defined by 

F _1 (n) = inf{x E ]R|F(x) = u} 

If U is a uniform random variable in [0,1], then F -1 (lf) has distribution function F. Also, 
ifX has distribution function F, then F(X) is uniformly distributed on [0,1]. 

Using this theorem and given an arbitrary continuous CDF F which has an explic¬ 
itly known inverse, a random variate with this distribution function can be gener¬ 
ated by the following simple algorithm: 

Corollary 4.2.2 (Inversion method). Random variates distributed with CDF F can be 
obtained as follows:: 

Generate a uniform [0,1] random variate U. 

RETURN F _1 (U) 

(The proof of this corollary is trivial and will be omitted) 
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4.2.2 ^Rejection method 

The requirement of having a CDF with known inverse is often hard to fulfil. So even 
if the inverse is well-defined, there may often be no analytical solution of F(x) = u 
and therefore it can difficult to compute the inverse fast and accurate. One way to 
fix this, is to use a numerical solution of F(x) = u. This would lead to an unavoid¬ 
able trade-off between computation time and accuracy. Instead we will make use of 
the rejection principle. This method allows to create accurate random variates at the 
cost of having to reject some generated ones as will be shown below. But first two 
theorems. 

Theorem 4.2.3. 1) Let X = (Xi, ... ,X d ) be a collection ofd independent and identically- 

distributed (iid) random variates i.e. a random vector with joint density f on R rf and 
let U be an independent uniform [0,1] random variate. Then ( X,cllf(x )) is uni¬ 
formly distributed on A = {(x,u)\x E [R (i ,0 < u < cf(x)} where c > 0 is an 
arbitrary constant. 

2) Vice versa, if (X, If) is a random vector in R‘ ?+1 uniformly distributed on A, then X 
has density f on R rf . 

Theorem 4.2.4. Let Xi,X 2 , ... be a sequence of iid random vectors taking values in R rf , 
and let A C R d be a Borel set such that P(X i E A) = p > 0. Let Y be the first X; taking 
values in A. Then Y has a distribution that is determined by 

PIY€B) = LEi±Am 

p 

where B is another Borel set in R rf . In particular, ifX\ is uniformly distributed in Ao, where 
Aq D A, them Y is uniformly distributed in A. 


Combining both theorems we get the following: 

Theorem 4.2.5 (Rejection method). Given a density function f let g be a function and 
c < 1 be a constant such that 

f(x) < c ■ g{x) Vx 

. Then the following algorithm generates random variates with density f: 

REPEAT 


Generate X with density g and U uniformly distributed on [0,1] 


set T 


g(X) 


/(X) 

UNTIL U ■ T < 1 


RETURN X 


In other words, three things are required here: 

(i) a (good) dominating density g 

(ii) a simple (and efficient) method for generating random variates with density 

8 
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(iii) knowledge of c 

Although g and c can be easily found by an analytical analysis of /, it's quite impor¬ 
tant to choose g wisely, because we have just exchanged our problem of generating 
random variates using the density / with same problem with using g instead. When 
we are going to construct the generators for the sample sets below, we will thus 
choose g such that one the hand its CDF is easily invertible, such that the inversion 
method can be used, and, on the other hand, it will be close enough to / so that the 
rejection rate is kept low to maintain the efficiency 


4.2.3 ^Euclidean-Euclidean case 

This case is simple if it is done by using the Euclidean model of cartesian coor¬ 
dinates. First, we assume that the sample set is centered in the origin. Then the 
equidistant surfaces i.e. the boundaries in the extra dimension are given by x 3 = ±s 
where 2s is the distance of the opposite boundaries. The boundaries in the map di¬ 
mension are obtained in the same manner. Hence the choice of each coordinate is 
independent of the choice of the other ones. Thus we get our uniform distributed 
sample set by combining the realizations of three iid random variates to a vector, 
i.e. let Xi : Q -A]X™ n ,X™ x ],X 2 : Q -t]X”" n ,X™*] and X 3 : Q ^]X™",X™“] be 
three uniform idd random variates. Then the resulting three-dimensional random 
variate X : Q —> S is obtained by: 


X(-) = (X 1 (.),X 2 (.),X 3 (-)) T 


4.2.4 ^Hyperbolic-Hyperbolic case 

The hyperbolic-hyperbolic case is the most complex one of the three. To simplify the 
problem we will work in the hyperbolic spherical model, because, as we have seen 
above, the probability density Qnspli only depends on one of the three dimensions, 
namely r. Furthermore we will only take care of the case that the sample set is less 
extended in the extra dimension than in the map dimensions (This is not a severe 
restriction since we can simply enlarge the map arbitrarily). We then transfer our 
results to the Poincare Disk by transforming the resulting realizations. 

Regardless of the beneficial form of the probability density the choice of the radial 
component r, the polar angle cp and the azimuth 0 still depend on each other due to 
the shape of the sample set. We will therefore fix r and have a look at the respective 
"shells" of the sample set. The propability that a sample has a certain distance r 
from the origin is then given by the "relative mass" of the shell where the measure is 
determined by the probability density and thus the volume element as seen above 11 . 
Fet us assume that we have the two-dimensional planar disk defined by p = 0 and 
r < R. Then the boundaries in the extra dimension are the equidistant surfaces 
above and beneath this plane at the distant s. As noted before, we will only regard 
the case where s < R. The other boundaries are given as already defined in section 

11 As announced we will hereby omit the normalization constant N. Thus actually we do not use the 
infinitesimal mass of the shell, but the "relative mass" which is the (finite) product of the mass of 
the shell and the total mass of the bounded volume of the sample set and therefore the density 
and cumulative distribution functions are normalizable but not normalized. 
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4.1.2. Fig.18 shows the three cases that may occur for the shells where R max is the 
maximal distance between the origin and the points in the volume. It can obtained 
by using the law of cosines in the right triangles with the side length R ma x, R arid 
s: 


Rmax = arccos(cosh(s) cosh(R)) 



FIGURE 18: 3 cases for shells corresponding to fixed r (projected onto plane 9 = 0 V 9 = n) 

If r is smaller than s the "relative mass" is simply the area of the surface of the 
three-dimensional sphere, i.e. 

rTC rT. 71 rTZ r27Z 

pi(r) = / / dV = / / (sinhr) 2 sincpdrdOdcp = 4nsinh 2 r 

Jo Jo Jo Jo 

The CDF is then obtained by taking the volume of the sphere that is 

F(r ) = / 4Tzsmh 2 (f)dr = 27r(sinh(r) cosh(r) — 1) 

Jo 

The constant summand unfortunately causes a problem if we would try to invert 
this CDF to apply the inversion method. We will see how to approach this when we 
are going to deal with the overall density function. 

When r exceeds s the surface is not longer the one of the whole sphere but restricted 
to the component between the two equidistant surfaces. Thus we have to determine 
the bounds of the polar angle <p corresponding to this restriction. We get these val¬ 
ues by using the equation, which we already derived for the equidistant curves (cf. 
Eq.2.22). The only thing we have to remind is, that we have to substitute the 9 which 
was originally used by n — (p. Hence we get: 


.(r) 

c(r) 


n . ( sinh(s) \ 

2 V smh 0) / 

tc . ( sinh(s) \ 

2 V smh ( r ) / 


= arccos 


sin 


h(s)\ 


= n — arccos 


sinh(r) J 
sinh(s) 
sinh(r) 
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Thus we obtain for the (non-normalized) density in this case: 


Pi(r) 




rn —arccos 


sinh(s) \ 
sinh(r) J 


sinh(s) \ 
sinh(r) ) 


4n sinh(s) sinh(r) 



(sinh r) 2 sin pdrdOdp 


In the case that r is now larger than R we get an additional restriction for the polar 
angle as we can see in Fig. 19. is equal to <p mm of the preceding case whereas 



dower 


FIGURE 19: Case r > R (projected onto plane 6 = 0 V 9 = n) 

(not shown in diagram) equals (p max . The remaining two angles can be deter¬ 


mined by regarding the yellow right triangle. The law of cosines yields 


cosh(r) = cosh(s) cosh(R) 


s = acosh 




cosh(r) ^ 

m)J 


cos 


Combining this result with the law of sines we obtain 


smh(s) , .uppers sinh(s) 

sinh(r) = - upper. ^ cos(tf ) = — w-T 

COS ((pmax ) Smh(f) 


,upper 

(pmax = arccos 


sinh (acosh 


cosh(r) \ 
cosh(R) J 


) 


sinh(r) 


= arccos 


y cosh 2 (acosh 

( cosh(r) N 
^cosh(R )) 


= arccos 

N 


sinh(r) 


/ / cosh 2 (r) ~T\ 

Y cosh 2 (R) 


sinh(r) 


V 
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Analogously follows 


( / cosh 2 (r) _ 


dower 


= tc — arccos 


cosh 2 (R) 


V 


sinh(r) 


Thus we obtain for the density: 


P3 0 ) = 


f<t>'mfx er {r) f2n r'Pmax' (?) r2n 

/ dV+ dV 

. <p u ti er (r) Jo J(p ,ower (r) Jo 

rmin \ / ^ t mm v / 

/ / ! 2 / 


= tosinhWIsinhW^y^d-l 


This function has a null at R m ax as can be checked easily 


Now we can define the overall density function: 


p(r) 


pi(r) if 0 < r < s 
(02 (r) if s < r < R 
p 3 (r) if R < r < R, 
0 otherwise 


It is worth mentioning that by construction p(r) is continuous and piece-wise even 
smooth. Since it has a compact support, the CDF exists. Unfortunately due to the 
part of the third case it is very hard to determine it analytically, let alone to invert 
it afterwards. Thus we cannot use the inversion method and we will have to use 
the rejection method instead. Thereby we first have to find a suitable dominating 
density function g. We will do this for each part individually. 


In the interval [0, s] the density is convex. This can be seen by differentiating it 
two times with respect to r. Thus, a good dominating approximation is the linear 
approximation and, by using sufficient many sampling points, the approximation 
error is small. Furthermore can the corresponding CDF F\ (r) easily be inverted as 
it is quadratic in each segment between the sampling points. 


For the second interval i.e. r G [s, R] the CDF can be obtained by using the orig¬ 
inal density function: 


= Fi(s) + J 47rsinh(s)sinh(f)df 
= Fi(s) +47rsinh(s)(cosh(r) — cosh(s)) 
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The inverse is then given by: 




h(r)-F i(s) 


4n sinh(s) 
r =: f 2 ~ 1 (^) = acosh 


= cosh(r) — cosh(s) 
X-Fi(s) 


47rsinh(s) 


■cosh(s) 


The density function in the third interval has a convex and a concave part. For some 
choices R and s it may not even be monotonic. We will nevertheless use the same 
approach as in the first interval, because even if the density may be therefore not 
dominating for all points the error is sufficiently small, if enough sampling points 
are chosen, since the original function is continuous. Using the rejection method, 
this will lead to a modelled distribution whose density is given by min(p(jc),g(jc)). 
So, if the linear approximation is already good, the error in the modelled distribu¬ 
tion is small as well. 

Fig.20 shows the graph of the density function p(r) using R = 2 and s = 1 and a 
rough approximation using 5 sampling point for each linear approximation. To get 



FIGURE 20: example for the original and dominating density g 

the wanted distribution for the random variate X r , belonging to r, we now com¬ 
pute the CDF of the (mostly) dominating density g. As already shown, or rather, 
mentioned above, this CDF can be inverted and thus the inversion method can be 
applied. The random variate thus obtained can then be used in the rejection method 
to finally get X r . Due to the isotropy of the shells (restricted to the bounds of (p) the 
random variates 


X# ; £ -> [(p m in{r),(pmax(r)] or [^(r)^ST(r)] U W™n ( r )><PmZ r (?)] 
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and 


[0,2/r] 

belonging to the two angles are independent and uniformly distributed. To retrieve 
the according distribution in the Poincare Sphere model we just have to convert the 
realization of the radius according to the known transformation. 


4.2.5 ^Hyperbolic-Euclidean case 

As noted in the definition, this distribution is just a product of the two-dimensional 
hyperbolic uniform distribution that is embedded in the Euclidean space and a one¬ 
dimensional Euclidean uniform one. For the first one, we use the same approach as 
in the Hyperbolic-Hyperbolic case above. We determine the CDF of the random 
variate X r : 


F(r) = 


r fin 


/0 Jo 

The inverse is then given by: 


sinh(f)d0df = 27rsinh(r) 


F x (X) = asinh 




2n J 


Thus we can use the inversion method. The random variate X# : Q —> [0, 2 tt] is due 
to the isotropy again independent and uniformly distributed. 

For the latter distribution we get the uniform distributed random variate X z : Q —> 
[—s, s] where 2s is the distance between the boundaries in the extra dimension. 
The wanted three-dimensional random variate is then obtained by combining these 
three in cylindrical coordinates and then converting the result to the cartesian coor¬ 
dinate system. 




Chapter 5 

Implementation of a General Riemannian 
SOM 


Besides an analytic approach, we intend to perform the stability analysis of the clas¬ 
sic SOM and the GRiSOM by using computer-based numerical means. We therefore 
need a computer program that simulates the GRiSOM and furthermore allows us 
to study its behavior in respect to the stability In the course of the work on this the¬ 
sis a whole software library has been written to meet this needs. Using C++ as the 
programming language, we took a special emphasis on the modularity, universality 
and (re)useablity of the implementation. 

Modularity has been achieved by using interfaces and abstract classes and avoiding 
cross-dependencies as far as possible. This has been visualized in Fig.46 in the Ap¬ 
pendix. It illustrates all dependencies between the header files of the project (and 
some extern ones). As it can be seen, the only cross-dependencies are between the 
headers defining templates and the corresponding files that implement their spe¬ 
cialization. This is unfortunately unavoidable due to the improper implementation 
of the C++ language standard in most of the compilers. 

The universality has been realized by using templates and specializations. This al¬ 
lows to design the software in a manner that is close to the mathematical structures. 
Thus e.g. points and spaces can be handled as abstract objects. Only if we want to 
use a particular representation of these mathematical objects, we have to specialize 
the templates to work with the corresponding implementations. 

The third goal was finally approached, besides the already listed features of the 
other two points, by adding certain structures for convenience purposes. This in¬ 
cludes e.g. wrapping structures of the SOM to allow to handle and even implement 
analyzers and classes for the automatization of the process of the numerical anal¬ 
ysis easily. The software itself has been separated in three parts. Two of them are 
libraries, namely the Core and the Auxiliary library, that can be used by linking them 
statically or dynamically to other projects. The third part consists of the tester pro¬ 
gram that can then be used to obtain the numerical results in the next chapter. 

We will now take a look at each of these parts, but before we continue, it should 
be noted that that this chapter does not intend to presents a full API of the soft¬ 
ware library but rather shall show how all the more interesting aspects discussed 
in the preceding chapters have been designed and implemented in the software 
that will be used in the upcoming numerical analysis. Thus the reader should be 
aware, that also inside the listings, sections as e.g. most of the comments may have 
been skipped and only but a few class diagrams are presented where they ease the 
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understanding. However, a full documention of the package can be found in an ad¬ 
ditional document. The chapter B in the appendix provides more information about 
this. 


5.1 Core library [libgsom_core. {a,so}] 

The core library bundles all the classes that are at the least needed to provide the 
most basic implementation of the generalized SOM. 


5.1.1 Required basic structures [topology. {h, cpp}] 

Back in chapter 1 we have presented the generalization of the SOM with more gen¬ 
eral types of map space and feature space. To take this into account we have define 
these spaces in our implementation: 

templatecclass PointType> 
class Space 
{ 

public : 

virtual Space<PointType >* clone () const; 

}; 

templatecclass PointType> 

class Metric_Space : public virtual Space<PointType> 

{ 

public : 

Metric_Space ( const typename metriccPointType >:: f& dist); 
virtual Metric_SpacecPointType>* clone () const; 
const typename metric cPointType >:: f distance; 

}; 

templatecclass PointType> 

class Geometric_Space : public Metric_SpacecPointType> 

{ 

public : 

Geometric_Space ( const Metric_SpacecPointType>& metric_space , 

const typename geodesic_line_segment cPointType >:: f& line); 

Geometric_Space ( const typename metric cPointType >:: f& dist, 

const typename geodesic_line_segment cPointType >:: f& line); 
virtual Geometric_SpacecPointType>* clone () const; 
const typename geodesic_line_segment cPointType >:: f line; 


Listing 1: topology.h - definition of space 

According to the axiomatic construction of the spaces, we have not given any fur¬ 
ther definition of a point. Therefore also in the implementation of the SOM, a point 
can be anything by strictly using it as a template parameter. But since we want to 
do calculations in certain models, three possible types of point have been picked. 


class Point 

{ 

public : 

virtual Point* clone () const = 0; 

virtual bool operator ==(const Point&) const = 0; 

}; 

class Simple_Point : public Point 

{ 

public : 

Simple_Point ( int _id=0); 
virtual Simple_Point* clone () const; 
virtual bool operator ==(const Point& p) const; 
int get_id() const; 
protected : 

/*! \brief identifier of point */ 
int id; 

}; 

class Coordinates : public Point 

{ 
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public : 

Coordinates (uint dim=0, double* coords=0); 

Coordinates (const Coordinates& c); 
virtual -Coordinates (); 

virtual bool operator ==(const Point& p) const ; 
virtual void operator=(const Coordinates& c); 
virtual Coordinates* clone () const; 
double operator []( const uint& index) const; 
void set (const uint& index, const double& new_value); 
uint get_dim() const; 
protected : 
uint dim; 
double* coords; 


LISTING 2: topology.h - definition of point types 

When implementing the particular spaces for the simulations, we will confine our¬ 
selves to using the (infinite-dimensional) Coordinate model as each of the defined 
models of the Euclidean, Hyperbolic and Spherical space make use of coordinate 
systems to describe the location of points. Simple_Point is mostly used in combi¬ 
nation with the trivial metric space to provide a simple map space for the vector 
quantization. Nonetheless, it would be possible, since we only need to distinguish 
between a finite number of points in the map spaces (namely the fixed positions 
of the neurons), to use Simple_Point in a more complex map space by explicitly 
defining all the distances between the element of the finite set. 


5.1.2 Generalized SOM algorithm [som. {h, cpp}] 

The next step is to implement a formal neuron. As defined in section 1.2.1, its just 
a pair of two points where one of them lies in the map space, the other one in the 
feature space. 

I- 

| templatecclass MapPointType, class FeatPointType> 
class Neuron 
{ 

public : 

Neuron( const MapPointType& map_point, 

const FeatPointType& init_feat_point); 

Neuron(const Neuron& n); 

-Neuron () ; 

void operator =(const Neuron& n); 

void set_feat_point ( const FeatPointType& new_feat_point); 

MapPointType map_point; 

FeatPointType feat_point; 


LISTING 3: som.h - definition of neuron 

The last missing "ingredient" is the implementation of a neighborhood function. To 
keep it still most general it is simply defined in the SOM class as a function (object) 
getting a distance represented by a <double> and returning its result in form of 
another <double>. 

typedef function <double( const double&,const double&)> neighborhood_function; 

Listing 4: som.h - definition of neighborhood function 

The concrete implementations of the particular neighborhood functions are thereby 
not part of the core library as they are not necessarily needed here. 


This is already everything that we need to implement the SOM now. 
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template<class MapPointType, class FeatPointType> 
class SQM 

{ 

public : 

typedef typename metric<MapPointType >:: f dist_map_type; 
typedef typename metric<FeatPointType >:: f dist_feat_type ; 

typedef typename geodesic_line_segment<FeatPointType >:: f geodesic_feat_line_type ; 
typedef Neuron<MapPointType, FeatPointType> neuron_type; 
typedef vector<neuron_type> som_net_type; 

typedef function <double ( const double&, const double&)> neighborhood_function; 
typedef Metric_Space<MapPointType> map_space_type; 
typedef Geometric_Space<FeatPointType> feat_space_type; 

SQM( som_net_type &neurons, 

const map_space_type& map_space, 
const feat_space_type& feat_space , 
const neighborhood_function& h 
); 

SOM( som_net_type &neurons , 

const dist_map_type& dist_map , 
const dist_feat_type& dist_feat , 
const geodesic_feat_line_type& line_feat , 
const neighborhood_function& h 
); 

SQM( const SOMkMapPointType , FeatPointType>& som); 
void print_state (ostream& stream = std::cout) const; 
const som_net_type& get_state() const; 

virtual Neuron<MapPointType, FeatPointType> adapt (const FeatPointType& input, 

const double& epsilon , 
const double& sigma); 

virtual ~9QM() ; 

/*! \brief working set of neurons */ 
som_net_type _neurons; 

/*! \brief map space — defining distances between map points */ 
const map_space_type* _map_space; 

/*! \brief feature space — defining distances and translations between map points */ 
const feat_space_type * _feat_space ; 

/*! \brief neighborhood function used in the adaption procedure */ 
const neighborhood_function _h; 

private : 

neuron_type* determine_winner( const FeatPointType& input) ; 

±_J 

Listing 5: som.h - definition of SOM 

As we can see here, each instance of SOM has its own map and feature space, neigh¬ 
borhood function and set of neurons to work with. The adaption parameters £ and 
cr will be then passed for each adaption step. At last we will have a closer look at the 
two important functions in the algorithm. The first one is to determine the neuron, 
in whose feature set or voronoi cell the given sample lies. Since we have emulated 
all the mathematical structures needed in the formal algorithm, the implementation 
is quite straightforward. 


export templatecclass MapPointType, class FeatPointType> 

Neuron<MapPointType , FeatPointType > SOMcMapPointType, FeatPointType >:: adapt ( const FeatPointType& input, const 
double& epsilon , const double& sigma) 

{ 

neuron_type* winner = determine_winner (input); 

FeatPointType winner_pos = winner—>feat_point; 

for (typename som_net_type :: iter ator it = _neurons . begin (); it!= _neurons . end () ; it++) 

i 

// move feat_point towards input 

double coeff = epsilon * _h(_map_space—>distance ((* it) .map_point, (* winner) .map_point ,0) ,sigma); 
if (coeff) // only if sth is to do 
{ 

it —>set_feat_point (_feat_space —>line (it —>feat_point , input, coeff)); 


return Neuron<MapPointType, FeatPointType >(winner—>map_point, winner_pos); 

i 

export template<class MapPointType, class FeatPointType> 
Neuron<MapPointType, FeatPointType >* SOMcMapPointType, FeatPointType > 

:: determine_winner(const FeatPointType& input) 

{ 

neuron_type* winner = 0; 
double winning_dist = 0; 
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for (typename som_net_type :: iterator it = _neurons . begin () ; it!= _neurons . end () ; it++) 


26 


28 


double actual_dist = _feat_space —>distance ((* it) . feat_point , input, winning_dist); 
if( ! winner II actual_dist <winning_dist) 


30 


winning_dist=actual_dist; 
winner = &(* it); 


32 


34 return winner; 


LISTING 6: som_template.cpp - implementation of adaption and determination of winner neuron 


5.2 Auxiliary Library [libgsom_aux. {a, so}] 

While the Core library, as seen above, only provides the most basic implementa¬ 
tion of the GRiSOM, the Auxiliary Library now provides the means needed to 
work seriously with it. This includes so-called Factory classes to create certain spaces 
(Ch.5.2.1), tessellations (Ch.5.2.2), distributions (Ch.5.2.3) and finally the particular 
SOMs in whole (Ch.5.2.4). Furthermore tools to analyze the behavior of the SOM 
are made available (Ch.5.2.5). 

5.2.1 *Space Factory [space_f actory. {h, cpp}] 

Regarding its dependencies the Space Factory would have to be discussed further 
down as it specializes structures that are abstractly defined in sections still follow¬ 
ing below. That means that, following the design decisions mentioned in the intro¬ 
duction of this chapter, e.g. some types of spaces are not defined until in the evalu¬ 
ation section as they will be mainly used there, but will be inherited by the concrete 
space models that have been implemented in the Space factory. We will nonetheless 
start with the look on the Space Factory as it reflects better the chronologic order of 
the approach of the implemention of the GRiSOM (since the implemented models 
are used to construct the particular SOMs that we want to simulate even without 
any evaluation of the results). 

In Chap.2 we discussed in detail the geometrical properties of the particular spaces 
and their models which we will use in the stability analysis. We now have to trans¬ 
fer them into the implementation. Beside the necessary structures of the metric and 
geometric space, our implemented spaces inherit three additional spaces namely 
Vector_Space, Pre_Hilbert_Space and Projectable_Space which on their part 
have structures that will be needed for certain Analyzers (cf. Ch.5.2.5). They pro¬ 
vide e.g. a sense of addition, scalar multiplication, (canonical) inner products or 
projections in some of our models. A further discussion about them will be delayed 
until we will have to use them in the section about the evaluations. The complete 
definition of our models are now: 


1 

2 


templatecclass PointType> 

class Euclidean_Cart_Coord_Space : public Geometric_Space<PointType>, 


4 


public Pre_Hilbert_Space<PointType>, 
public Projectable_Space <PointType> 


6 public: 


typedef Isometry <PointType , Euclidean_Cart_Coord_Space<PointType> > Isometry; 
8 I virtual Euclidean_Cart_Coord_Space<PointType>* clone () const; 
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private : 

friend class Space_Factory <PointType, Euclidean_Cart_Coord_Space<PointType> >; 

Euclidean_Cart_Coord_Space ( const Geometric_Space<PointType>& gs , 

const Pre_Hilbert_Space <PointType>& phs, 
const Projectable_Space <PointType>& ps 
); 

); 

templatecclass PointType> 

class Hyperbolic_Disk_Space : public Geometric_Space<PointType >, 
public Projectable_Space <PointType> 


public : 

typedef Isometry <PointType, Hyperbolic_Disk_Space<PointType> > Isometry; 
virtual Hyperbolic_Disk_Space<PointType>* clone () const; 

private : 

friend class Space_Factory <PointType, Hyperbolic_Disk_Space<PointType> >; 
Hyperbolic_Disk_Space ( const Geometric_Space<PointType>& gs , 
const Projectable_Space <PointType>& ps 
); 

); 

templatecclass PointType> 

class Spherical_Geo_Space : public Geometric_Space<PointType> 

{ 

public : 

virtual Spherical_Geo_SpacecPointType>* clone () const; 
private : 

friend class Space_Factory cPointType , Spherical_Geo_Space<PointType> >; 
Spherical_Geo_Space( const Geometric_SpacecPointType>& gs 


Listing 7: space_factory.h - definition of several spaces and models 


To avoid confusions due to the multiple inheritance, the class diagram in Fig.44 il¬ 
lustrates the relations between these special spaces and the overall structure. All 
methods, that can used to get instances of these special spaces, are now bundled 
in the template Space_Factory using a specialization of it for each model. The con¬ 
structors of the space classes itself are defined as private to assure the consistency 
by restricting the possibility to create instances to the particular specialized fac¬ 
tory class. Henceforth we will focus the further discussion to two models, namely 
the Poincare Disk model embedded in the complex plane and the Poincare (hy¬ 
per) sphere model embedded in the IK‘ / . All other spaces and models have been 
analogously implemented. 

Since the Poincare disk space is a subclass of both the geometric and the pro- 
jectable space, it has to provide all their functions i.e. the "metric" and the "geodesic 
line segment" of the geometric space and the "canonical projection" of the pro- 
jectable space. The 'projections' thereby map the points in the space onto a pre¬ 
defined subset. In our case it is just a less dimensional subspace and the projected 
point is the point in the subspace that is closest to the one that we project. In our 
implementation we then obtain the result of the projection by simply reflecting the 
point in this subspace and then computing the point on the geodesic in the middle 
between these two points. 

We get now the following specialization of the space factory template: 

1 j template <> 

2 class Space_Factory <Coordinates , Hyperbolic_Disk_Space<Coordinates> > 

| { 

4 I public: 

static metric<Coordinates >:: f create_metric (); 

6 static geodesic_line_segment<Coordinates >:: f create_geodesic_line_segment () ; 

static function<Coordinates (const Coordinates&, const uint&)> create_canonical_projection () ; 

8 static Hyperbolic_Disk_Space<Coordinates> create_space () ; 
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private : I 

static double metrics ( const Coordinates& cl, const Coordinates& c2, const double& cutoff =0); 

static Coordinates geodesic_line_segments ( const Coordinates& cl, const Coordinates& c2, const double& param); I 
static Coordinates canonical_projection ( const Coordinates& c, const uint& dim); | 

1_J 

LISTING 8: space_factory.h - specialization of space factory 

The private methods define the needed functions or function objects while the cor¬ 
responding public creation method bind the passed parameters to these functions 
before returning them. This allows to have e.g. a boundary condition as an inher¬ 
ent attribute of the metric or geodesic. The implementation of the distance function 
thereby simply uses Eq.2.9: 
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For the implementation of the geodesics (and therefore also for the projection) we 
will have to discuss another structure, we make use of, namely the isometries. 


| double Space_Factory cCoordinates , Hyperbolic_Disk_Space<Coordinates> >:: metrics ( const Coordinates& cl, 

const Coordinates& c2, 
const double& cutoff) 

{ 

double norm_lsq =0; 
double norm_2sq =0; 
double norm_diffsq =0; 

uint dim = max(cl . get_dim () ,c2 . get_dim ()); 

for (uint j=0; j<dim; j++) 

{ 

double diff=c2 [ j ]-cl [ j ]; 

norm_lsq += pow( cl [ j ] ,2) ; 
norm_2sq += pow( c2 [ j ] ,2) ; 
norm_diffsq += pow(diff ,2) ; 

} 

return acosh(l + 2*norm_diffsq/((l — norm_lsq) *(1 — norm_2sq))); 


Listing 9: space_factory.cpp - implementation of creating distance function 
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^Isometries [isometry.h] 

The definition and implementation of isometric transformations in the library serve 
several purposes. First of all, as concluded in chapter 3.1.1, we can use a subgroup 
of them to generate the regular tessellations of the space. Furthermore, referring 
to section 2.4.2, they provide a more comfortable way to compute certain geomet¬ 
ric structures like the geodesics. In Listing 10 we see the generic definition of this 
concept in the source code. 
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| templatecclass PointType, class SpaceType> | 

class Isometry I 

{ | 
public: I 

virtual PointType apply(const PointType& p, double* error=0) const; 

virtual Isometry <PointType , SpaceType> operator *( const Isometry <PointType , SpaceType>& T) const ; I 

virtual bool operator ==(const Isometry<PointType ,SpaceType>& T) const; 

virtual bool apply_equal( const Isometry<PointType ,SpaceType>& T, const PointType& p) const; I 

virtual Isometry <PointType ,SpaceType> inverse () const; 

virtual double get_error ( const PointType& p) const; I 

static Isometry<PointType ,SpaceType> Identity (); 

__J 

LISTING 10: isometryh - generic definition of isometries 


Besides the group operation and a way to compare, invert and apply the isometries, 
we added two more functions. The first one, apply_equal(...), provides a way to 
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determine, if the isometry object and another given one map a given point to the 
same new point. get_error () however allows to return the intern numerical error 
of the isometry. This error occurs in any numerical computation and determines the 
precision of the results. Thus, when we get two points which are results of transfor¬ 
mations, they may represent the same analytical result if they differ by at least by 
this error margin. This has been taken into consideration when generating the tes¬ 
sellations. Nonetheless does this generic definition and its implementation provide 
the structure of the trivial isometry group i.e. the one only containing the Identity. 


As an example for a specialization of the isometries, we will once again take the Hy¬ 
perbolic Poincare (hyper-)sphere space with complex numbers as representations of 
points. We have already done all the needed preparatory work to determine the for¬ 
mulas for the operations, as for inverting or combining, in the process of proving 
Thm.2.4.1. So again the implementation is straightforward. 


Complex Isometry cComplex, Hyperbolic_Disk_Space<Complex> >:: apply (const Complex& z, double* error) const 


{ 

Prec_Complex result((_p + (_r *( _inv?conj (z) : z))) / ((conj (_p) *_r *(_inv?conj (z) : z)) + 1)); // (r*z + p)/(conj(p) 
*r*z +1) resp . (r*conj(z) + p)/(conj (p)*r*conj (z) +1) 
return Complex( real ( result) ,imag( result)); 

} 


Isometry cComplex, Hyperbolic_Disk_Space<Complex> > Isometry <Complex, Hyperbolic_Disk_Space<Complex> >:: operator 
*( const Isometry cComplex, Hyperbolic_Disk_Space<Complex> >& T) const 

{ 

Prec_Complex denom = _r *(_inv? conj (T._p) :T._p) * conj (_p) + 1; 

Prec_Complex R = (_r *( _inv?conj (T. _r) :T. _r )+_p *( _inv ?(conj (T._r)*T._p) : (T. _r * conj (T. _p)))) /denom; 

Prec_Complex P = (_r *(_inv?conj (T._p) :T._p) +_p)/denom; 

R = normalize (R) ; 

return Isometry cComplex, Hyperbolic_Disk_Space<Complex> >(R,P,!(_inv ==T._inv) ) ; 

i 

Isometry cComplex, Hyperbolic_Disk_Space<Complex> > Isometry <Complex, Hyperbolic_Disk_Space<Complex> >:: inverse () 

{ 

return (Isometry cComplex, Hyperbolic_Disk_Space<Complex> >(Complex(l) ,0,_inv) 

* (Isometry<Complex,Hyperbolic_Disk_Space<Complex> >(conj(_r)) 

* Isometry<Complex,Hyperbolic_Disk_Space<Complex> >(1 ,_p* —1))); 


J 


LISTING 11: space_factory.cpp - specializations of isometry (implementation) 


All the specializations for the other spaces and models can be again obtained in an 
analogous manner. For this case we can now finally transfer our results of section 
2.4.2 to implement the function of the geodesics. 


Coordinates Space_Factory cCoordinates , Hyperbolic_Disk_Space<Coordinates> > 

:: geodesic_line_segments (const Coordinates& cl, const Coordinates& c2 , 
const double& param) 

{ 

if(param==0) return Coordinates (cl); 
if(param==l) return Coordinates (c2); 

uint dim = max(cl. get_dim () ,c2 . get_dim ()); 

// using variation of Alguiera—Perez Algorithm i.e. rotating into a 2 dim disk and then working there 

Isometry cCoordinates , Hyperbolic_Disk_Space<Coordinates> > T = Isometry cCoordinates , Hyperbolic_Disk_Space< 
Coordinates> >:: Identity () ; 

// rotate cl into X_n axis 

Coordinates pi = cl; 

for (uint i=0; icdim—l;i++) 

i 

Isometry cCoordinates , Hyperbolic_Disk_Space<Coordinates> >M 

= Isometry cCoordinates , Hyperbolic_Disk_Space cCoordinates > >:: Pure_Main_Rotation (atan2 (pi [ i ] ,pl [ i +1]) , 
i, i+1); 
pl = M. apply (pi); 

T = M*T; 

} 

// rotate c2 into X_n—1 — X_n — Plane 
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Coordinates p2 = c2; 
p2 = T.apply(p2); 

for (uint i=0; i<dim—2;i++) 

{ 

Isometry cCoordinates , Hyperbolic_Disk_Space<Coordinates> >M 

= Isometry cCoordinates , Hyperbolic_Disk_Space cCoordinates > >:: Pure_Main_Rotation (atan2 (p2 [ i ] ,p2 [ i +1 ]), 

i / i+1); 

p2 = M. apply (p2); 

T = M*T; 

} 

// now move point using Translation in 2D Poincare Disk 
double dist = metrics (pi ,p2); 

// need implementation of translations in isometry group, will use workaround instead by doing it manually 
// Isometry cCoordinates , Hyperbolic_Disk_SpacecCoordinates> >M = Isometry cCoordinates , 

Hyperbolic_Disk_SpacecCoordinates> >:: Pure_Translation (pi ); 

// T= M*T; 

Complex ppl (pi [dim —2],pl [dim —1]); 

Complex pp2(p2[dim —2],p2[dim —1]); 

pp2 = (pp2 - ppl)/((conj(ppl»(-l.))*pp2)+l.); 

double dist_PD = abs(pp2); 

// and now let's rescale 

double new_dist_PD = sqrt ((cosh ( dist *param) — 1)/(cosh( dist *param) +1 )); 
double ratio = new_dist_PD/dist_PD; 

pp2 *= ratio ; 

// now rewind all trafos 

Pp2 = (pp2+ppl)/(conj(ppl)*pp2+l.); 
p2. set (dim—2,real (pp2)); 
p2. set (dim—l,imag(pp2)); 

Coordinates result = T. inverse (). apply (p2); 
return result; 


LISTING 12: space_factory.cpp - geodesic function ofHyperbolic_Disk_Space<Coordinates>) 


^Precision algebra/complex [prec_algebra/complex.{h,cpp}] 

In the shown implementations of the chosen specializations of the isometries, sev¬ 
eral structures can been seen that we have not introduced yet. They present two 
different ways to approach the problems with numerical errors. 

The bundle consisting of P_Float,P_Coordinates and P_Matrix serves two pur¬ 
poses. The first one is to provide matrix and vector calculations. The second is to 
estimate the numerical error of its instances due to the finite precision of the nested 
representation of floats and the error propagation when using these error-prone 
representations in calculations. They do not improve the precision, but they pro¬ 
vide at least a way to get the resulting error margins to take them into consideration 
when using the representations. 

The second approach is actually to substitute the floating point numbers with data 
type with much higher precision. This is of great importance when we are going 
to generate tessellations of the Poincare model as we will have to decide there, if 
two representations denote the same point. Near the origin we could still use the 
approach of estimating the error, but for nodes, generated in a larger distance to the 
origin, the Euclidean distance between two neighboring nodes tends to zero while 
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the error increases since more and more Isometries are combined to generate these 
points. We therefore quickly reach the limit where the error exceeds the quarter of 
the distance and therefore makes it impossible to decide if a new node already ex¬ 
ists or not. To tackle this problem, we made use of two external C-libraries ([RinOl] 
and/or [Pri]) that already implemented high-precision numbers, so we have had to 
write a wrapper class for their functions to use them in our project. 


5.2.2 ^Generating tessellations [t e s s el at i on. {h, cpp}] 

When discussing tessellations in chapter 3, we have seen, that any kind of regu¬ 
lar tiling can be generated easily by using one of its symmetry groups. This can 
be transferred one-to-one to an implementation. We will first create the symme¬ 
try group G = • • • } as proposed which is the group of involutions in the 

side centers. The following listing shows as an example the function to create the 
symmetry group, or rather, the generators for the Euclidean Cartesian Coordinates 
model. 
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const Tesselation cComplex, Euclidean_Cart_Coord_Space<Complex> >:: Discrete_Symmetry_Group 
Tesselation <Complex, Euclidean_Cart_Coord_Space<Complex> > 

:: create_Tess_Group (const double& NN_dist, const uint schlaefli_p , const uint schlaefli_q) 

{ 

assert (NN_dist >0); 

Discrete_Symmetry_Group result; 

assert( // 1./ schlaefli_p + 1./ schlaefli_q == 0.5 

(schlaefli_p == 4 && schlaefli_q == 4) // cubic 
II (schlaefli_p == 3 && schlaefli_q == 6) // triangular 
II (schlaefli_p == 6 && schlaefli_q == 3) // hexagonal 
); // otherwise invalid 

Isometry<Complex, Euclidean_Cart_Coord_Space<Complex> > trans 

= Isometry <Complex, Euclidean_Cart_Coord_Space<Complex> >:: Pure_Translation (NN_dist/2); 
Isometry<Complex, Euclidean_Cart_Coord_Space<Complex> > invol 

= Isometry <Complex, Euclidean_Cart_Coord_Space<Complex> >:: Pure_Rotation (M_PI); 

for (uint i=0; i< schlaefli_q; i++) 

{ 

double angle = ( double (i)/schlaefli_q ) *2*M_PI; 

Isometry<Complex, Euclidean_Cart_Coord_Space<Complex> > rot 

= Isometry <Complex, Euclidean_Cart_Coord_Space<Complex> >:: Pure_Rotation (angle); 
result. push_back ( rot. inverse () * trans . inverse () * invol *trans*rot); 

} 


return result; 

28 | ) 


LISTING 13: space_factory.cpp - creating symmetry group 
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According to ch.3 we can expand each node using the respective symmetry group. 
The following listing of the implementation shows the generation of the first ring or 
layer of nodes which is therefore the expansion of the origin (the expression "ring" 
originates from using triangular tailings, where each layer is ringlike connected. 
This holds not for arbitrary (regular) tesselations.) 
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export template<class PointType, class SpaceType> 

vector <PointType> Tesselation_Factory <PointType , SpaceType> 

:: create_regular_Tesselation (const Discrete_Symmetry_Group& sym_grp, 

const function< bool(const PointType&, const double&) >& acceptance, 
const uint& _max_level, const uint& _min_level, 
const PointType& origin) 

{ 

vector <PointType> generated_points; 
vector<Node> nodes_last_level; 
vector <Node> nodes_current_level; 
vector <Node> nodes_next_level; 

// 1st level will generated by Symmetry_group starting at the origin 










5.2 Auxiliary Library [libgsom aux. {a, so}] 


63 


for(typename Discrete_Symmetry_Group :: const_iterator iter= sym_grp. begin () ; iter != sym_grp . end () ; iter++) 

{ 

Symmetry_Trafo current_trafo = *iter; 
double current_error; 

PointType current_point = current_trafo . apply (origin , &current_error ); 
if (acceptance(current_point, current_error)) 

nodes_next_level. push_back (Node( current_trafo , current_point , current_error)); 

} 

LISTING 14: tesselation_template.cpp - creating first layer 
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As, in general, there exists many pathes in the tiling connecting the origin and v by 
following the edges and each of this path represents one of the possible sequences 
such that 3.1 holds, it is important to consider that the sequence is not unique for 
each point. That does not mean, by the way, that the compositions corresponding 
to these sequences are equal. It only implies that they act equally on the origin by 
mapping it to v. We have to consider this in order to avoid creating a vertex twice or 
even more times. Since each layer is fully expanded before we start to expand the 
first vertex of the following one, there are only three possibilities for these double 
creations to take place (assumed there are still none in the already expanded layers). 
Regarding an expansion of a vertex of the n-th layer: 

• newly created vertex lies in the (ft — l)-th layer 

• newly created vertex lies in the ft-th layer 


Each layer can now be obtained by expanding each node of the preceding one. 


| uint level = 2; 

uint max_level = _max_level; 

// if max_level is zero then use max level of implementation 
if (max_level==0) max_level = MAX_LEVEL; 

while (nodes_next_level. size () && level <= max_level) 

{ 

//cout « " Tesselation: generating level " « level « " 

// "old" last level not used any more, points are transferred to generated points if level before last 
level >= _min_level 
if (level >= _min_level+3) 

for ( typename vector <Node>:: iterator it_nodes = nodes_last_level. begin () ; 
it_nodes < nodes_last_level.end() ; it_nodes++) 
generated_points . push_back(it_nodes —>point); 

nodes_last_level = nodes_current_level; 
nodes_current_level = nodes_next_level; 
nodes_next_level. clear () ; 

for (typename vector <Node>:: iter ator it_nodes = nodes_current_level. begin () ; it_nodes < 
nodes_current_level.end() ; it_nodes++) 

( 

for (typename Discrete_Symmetry_Group :: const_iterator it_sym = sym_grp. begin (); it_sym != sym_grp.end 
() ; it_sym++) 

{ 

Symmetry_Trafo current_trafo = it_nodes —>trafo * (*it_sym) ; 
double current_error; 

PointType current_point = current_trafo . apply (origin , &current_error); 

// check if accepted, i.e. e.g. uncertainty of generated point due to rounding error is small 
enough 

// compared e.g. to the expected distance between to nodes in this neighborhood 
// or if generated point is still in the volume that shall be tesselated . 
if ( acceptance (current_point , current_error )) 

nodes_next_level. push_back (Node( current_trafo , current_point , current_error )); 



// cout « "Node totally expanded"<<endl; 

} 

level++; 

// cout « nodes_next_level. size () « " Points generated" « endl; 


LISTING 15: tesselation_template.cpp - creating layers recursively 
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• newly created vertex lies in the (n + l)-th layer 

It cannot occur in lower layers since these have been already expanded and, thus, 
also their children vertices which lie at least a layer below the n-th one. Therefore 
the vertex that we are actually expanding would have to be a twin of a node in a 
layer below which contradicts our assumption that this has not occurred yet. And 
it cannot take place in higher layers since this would obviously be a contradiction 
to the definition of the layers. Thus it is sufficient to check for each new vertex if it 
already belongs to one of the three layers. In the implementation, this is performed 
by the following code: 
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The truncation has been realized by, on the one hand, allowing to specify the max¬ 
imal level of layers, and on the other hand, including an acceptance function, that 
allows to explicitly decide which node should be accepted in the map and which 
should be rejected due to e.g. a too high errors in the generation process or, in the 
case of the truncation in respect to given boundaries (as used to create e.g. the rect¬ 
angular boundaries for the Euclidean maps), due to its position. Thus, in the case 
that we want to truncate in respect to the expansion level we simply pass this level 
when calling the generator and when we want to truncate in respect to boundaries, 
we pass a function object, that returns false for each passed node that lies outside 
these wanted boundaries and true otherwise (consideration of other reasons of re¬ 
jection neglected). The maximal number of layers for the generation process has 
thereby to be set to infinity (The termination of the algorithm is then only ensured 
by the acceptance function). 


export template<class PointType, class SpaceType> 
vector <PointType> Tesselation_Factory <PointType , SpaceType> 

:: create_regular_Tesselation (const Discrete_Symmetry_Group& sym_grp, 

const function< bool(const PointType&, const double&) >& acceptance, 
const uint& _max_level, const uint& _min_level, 
const PointType& origin) 

{ 

vector <PointType> generated_points; 
vector<Node> nodes_last_level; 
vector <Node> nodes_current_level; 
vector <Node> nodes_next_level; 

// 1st level will generated by Symmetry_group starting at the origin 

for(typename Discrete_Symmetry_Group :: const_iterator iter= sym_grp . begin () ; iter != sym_grp. end (); iter++) 
{ 

Symmetry_Trafo current_trafo = *iter; 
double current_error; 

PointType current_point = current_trafo . apply ( origin , &current_error ); 
if (acceptance (current_point, current_error )) 

nodes_next_level. push_back(Node( current_trafo , current_point , current_error )); 


Listing 16: tesselation_template.cpp - implementation of the tessellation algorithm 


5.2.3 Generating sample sets [stochastics. {h, cpp}] 

For the SOM configuration that we want to analyze numerically we had to imple¬ 
ment all the distribution that we discussed in detail in Ch.4. Besides the interface 
class Distribution, which works as a base class for all the distributions, the al¬ 
gorithm presented in Ch.4.2 could be transfered one-to-one to the implementation. 
TheUniform_Hyperbolic_Disk_Distribution and Hyperbolic_Disk_Euclidean_Cart 
Coord_Distribution thereby use theUnif orm_Hyperbolic_Spherical_Distribution 
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and Uniform _Euclidean_Cart_Coord_Distribution as generators the correspond¬ 
ing random sample set. The class diagram in Fig/43 in the appendix shows these 
aggregations. 


5.2.4 SOM Factory [som_f actory. *] 

The SOM factory now uses all the structures that had been defined in the preced¬ 
ing sections to finally ensemble the SOMs for the numerical simulations. Similar 
to many generic classes above SOM_Factory is thereby specialized for any needed 
combination of map and feature space. These specializations on their part provide 
functions to create all the variations of SOMs with the given spaces that we want to 
analyze. Since a listing of this implementation is not very instructive the interested 
reader searching for the concrete implementation may be therefore referred to the 
API. 


5.2.5 Evaluation methods and tools 

The evaluation process, that has been used to analyze the stability features of the 
different som configurations, has been realized by dividing the problem into to sub¬ 
problems. On the one hand we needed an automatization of the "training" of the 
SOM for a given set of parameters and sample distributions and, on the other side, 
we needed methods to analyze the SOM using various measurements. We will be¬ 
gin with having a look at the latter requirement. 


Analyze methods & measurement [evaluation. fh,cpp}] 

As mentioned above, we need tools to analyze the SOM e.g. compute e.g. the Fourier 
transformation of its feature space or the mean value of each neuron taking multi¬ 
ple results of adaption steps into account. To meet this need, an abstract Analyzer 
class has been defined (cf. listing 17) and then a Fourier and a mean analyzer have 
been implemented. 
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Since these tools should work on our generalized maps, they can only use the prop¬ 
erties and structures that are inherent to the spaces of the particular SOM. For ex¬ 
ample does the mean analyzer need to work on at least a geometric feature space 


| templatecclass MapPointType, class FeatPointType> 
class Analyzer 
{ 

public : 

virtual void add_sample( const FeatPointType& last_sample , const double* last_adapt_params , 
const FeatPointType& last_winning_pos , 

const vector<Neuron<MapPointType,FeatPointType> >& result_som_state) =0; 

virtual void reset()=0; 

virtual void registers ( const Analyzable_SOM<MapPointType, FeatPointType >* parent, 

const vector<Neuron<MapPointType, FeatPointType> >& initial_som_state 
// = vector <Neuron<MapPointType, FeatPointType > >() 

) throw (Insufficient_Space_Exception , Max_Reg_Exception) =0; 

/*! Function can be used e.g. to inform registrators if Analyzer is destroyed */ 

virtual void unregisters ( const Analyzable_SOM<MapPointType,FeatPointType>* parent)=0; 


LISTING 17: evaluation.]"! - Analyzer 
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to calculate the mean point of the sequence of results. Since this has been already a 
prerequisite of the SOM itself, it is always fulfilled. By contrast, this is not enough 
for the Fourier analyzer which has to work on a R-Vectorspace. It is obvious that 
there are geometric feature spaces that have not this needed structure (e.g. the mod¬ 
els used for the hyperbolic space) and a Fourier analyzer would not be able to work 
on them. So, it has to be possible for the analyzers to get access to the spaces of the 
SOM and in return the SOM should be able to reject analyzers that would be able 
to work on it. These problems have been solved by extending the SQM class into the 
Analyzable SOM class that is shown below: 
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LISTING 18: evaluation.!! - Analyzable SOM 

It offers the possibility to add analyzers directly to the SOM without having to re¬ 
gard any of the prerequisites that have been discussed above. By adding an ana¬ 
lyzer, the SOM itself will then provide both the map and the feature space for it. 
It will furthermore pass a copy of its internal state to allow the analyzer to use it 
as a reference or a first sample. In return the analyzer tests if the map and feature 
spaces meet its needs and accepts or rejects them, respectively. The rejection will be 
signaled by throwing an Insuff icient_Space_Exception. If successfully added. 


template<class MapPointType, class FeatPointType> 

class Analyzable_SOM : public SOlVkMapPointType, FeatPointType> { 

public : 

typedef Neuron<MapPointType, FeatPointType> neuron_type; 

// no pointers in vector to prevent illegal duplicates of neurons 
typedef vector<neuron_type> som_net_type; 

typedef function <double ( const double&, const double&)> neighborhood_function; 
typedef Metric_Space <MapPointType> map_space_type; 
typedef Geometric_Space<FeatPointType> feat_space_type; 

Analyzable_SOM(som_net_type &neurons, 

const map_space_type& map_space, 
const feat_space_type& feat_space , 
const neighborhood_function& h 
): 

Analyzable_SOM(som_net_type &neurons, 

const typename metric <MapPointType >:: f& dist_map , 
const typename metric<FeatPointType >:: f& dist_feat , 

const typename geodesic_line_segment<FeatPointType >:: f& get_point_at_geodesic_feat_line , 
const neighborhood_function& h 
); 

virtual ~ Analyzable_SOM() ; 

virtual Neuron<MapPointType, FeatPointType> adapt (const FeatPointType& input, 

const double& epsilon , 
const double& sigma); 

void add_Analyzer(Analyzer<MapPointType,FeatPointType>* device, 
const uint& measure_interval = 1) 
throw (Insufficient_Space_Exception , Max_Reg_Exception); 
void remove_Analyzer (Analyzer<MapPointType, FeatPointType>* device); 
void remove_All_Analyzers () ; 

const map_space_type& get_Map_Space() const; 
const feat_space_type& get_Feat_Space () const; 

private : 

struct analyzer_registration { 

Analyzer <MapPointType, FeatPointType >* analyzer ; 
unsigned long registered_since; 
int measure_interval; 

}; 

typedef vector <analyzer_registration > reg_list; 
reg_list registered_Analyzers; 

unsigned long sample_counter; 
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the SOM will then automatically call the add_sample method of the analyzer after a 
predefined amount of adaption steps (the size of this interval has been passed to the 
SOM when the analyzer has been added). This will be repeated until the analyzer 
has been removed from the SOM. In general, a SOM can handle many analyzers 
at the same time. It would also be possible to design an analyzer that can exam¬ 
ine several SOM simultaneously. But in the case that the SOM or the analyzer have 
reached their maximal limit, this will be signaled by the Max_Reg_Exception. 

The so defined combination of extended SOM and analyzers thus allows to exam¬ 
ine specific attributes. But, for our goal to find the stability limits in the parameter 
space, we have to define a kind of error function i.e. a mapping of a state or at¬ 
tribute of the SOM belonging to a given parameter set on a positive real number. 
This abstract class is defined as following: 
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The naive approach would be to measure the distance between the positions of the 
neurons in a reference state and in the actual state and calculate the error depending 
on this results. A problem thereby is, that the feature spaces may have many sym¬ 
metries. In the Euclidean and hyperbolic models, these are certain translation and 
rotation invariances. Without breaking these symmetries (e.g. by using a (finite) 
bounded subspace or fixing a few neurons), the representation of map can rotate 
and in the case of a virtual unbounded space by using periodic boundary conditions 
even translate freely. That means, that there exists a whole continuum of equivalent 
som states. The calculated error should therefore depend only on the equivalence 
classes. That would be obviously not fulfilled by the naive approach above. An im¬ 
plementation of this Error_Analyzer is the Mean_Extra_Dim_Analyzer. It extends 
the mean analyzer. It calculates the error by taking the mean som state and com¬ 
puting in the feature space the distance of each neuron to a corresponding neuron 
in a given reference som by regarding only a particular dimension. This is done by 
projecting both the mean neuron and the reference neuron onto a given plane and 
calculating the distance of the neurons and their projection. The error is then the 
sum or difference of these distances depending if both are on the same or on oppo¬ 
site sites of the projection plane. Thus this analyzer needs a feature space which has 
a projection mapping defined on it. We already briefly discussed in Ch.5.2.1 how 
our models of interest realize this projections and thus meet the requirement of this 
analyzer. 

Although we won't discuss this in detail, a few more analyzers and error analyzers 
had been implemented like the already mentioned Fourier analyzer and an addi¬ 
tional error analyzer that computes the mean representation error of the neurons in 
respect to the presented sample set. In general, any (error) analyzer, which depends 
only on the available data that is passed by the Analyzable SOM (i.e. the current 
som state, the currently presented sample and the position of the representative of 
the winner neuron in the feature space), can be implemented. 


| templatecclass MapPointType, class FeatPointType> | 

| class Error_Analyzer: public virtual Analyzer<MapPointType, FeatPointType> | 

I public: I 

| virtual double get_error()= 0; | 

L 1 :_J 

LISTING 19: evaluation.h - Error Analyzer 
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* Automatization [test_suite. fh,cpp}] 

At the beginning, we tried to implement an automatization that should place us in 
a position to run the stability analysis totally unsupervised, but the first attempts 
failed as sometimes the increasing fluctuations near the stability limit were by mis¬ 
take identified as the limit itself and therefore the further sweeps of the search were 
performed at the wrong positions. We therefore decided to implement at least a set 
of functions that ease a supervised analysis. The class Test provides two simple 
functions that take a SOM, a generator for samples, a function that maps the cur¬ 
rent index of the adaption step to a parameter set for the adaption allowing to vary 
£ and J throughout the simulations. Then they run the adaption procedure on the 
given SOM for a given number of adaption steps. 

Since we want to analyze the behavior of a given SOM configuration in a particular 
one-dimensional parameter space, namely the size s of the extra dimensions, we 
furthermore implemented the class Stability_Test that inherits Test. It provides 
a function that automatizes the search at least for one sweep as it takes a discrete set 
of values of s, a set of the corresponding generators for the samples and addition¬ 
ally a list of error analyzers. The adaption process will then be performed for each s 
and the error values, computed by the single analyzers, and the initial som state are 
stored in separate output files. If furthermore an error analyzer is also a mean ana¬ 
lyzer (as it is the case in our simulations as we use the Mean_Extra_Dim_Analyzer) 
the mean states of each search steps are also saved to allow further studies and 
visualizations of the results later on. 


5.3 Testers [tester_*. cc] 

Besides the two libraries the written software package comes with several tester 
programs. Most of them just have the purpose to test certain parts of the library. 
For example generates and visualize tester_stochastics any of the pre-defined 
random distributions and tester_tesselation any of the regular tessellations 12 . 

But the most important for our purposes is main_test. cc. In its compiled form 
it is the program we use for the numerical analysis. It takes several parameters that 
completely define the SOM configuration as well as the testing process 13 . Given 
the specific parameters, it automatically generates the according GRiSOM and dis¬ 
tribution as well as a Mean_Extra_Dim_Analyzer and a Mean_Rep_Error_Analyzer. 
Then it passes them to the sweep function of the Test_Suite. Furthermore a new 
subfolder of folder data/ in the working directory is created and the result will be 
written there for future analysis. 


12 The visualization requires an installed gnuplot 

13 the specific syntax for the use is given Ch.B.3.1 in the appendix. 
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Chapter 6 

Stability Analysis of SOM / Analytic 
approach 


As mentioned in the motivation of this thesis we will introduce and extend two ap¬ 
proaches formerly done by H. Ritter and K. Schulten in [RS88]. The first one is the 
analytical approach, which we will present in this chapter by analyzing the dynam¬ 
ics of the (generalized) SOM and following closely the calculations of the paper. As 
an result we will obtain the Fokker-Planck Equation of this stochastic process. This 
result can then be used to determine analytically the stability under the special cir¬ 
cumstance of having spatially uniform sample densities. 

The second approach that will be used is a numerical one, i.e. we will make use 
of Monte Carlo simulations to search the parameter space for the limits of stability. 
It will be used to confirm the analytical results on the one hand and to compute 
results beyond these cases (e.g. for hyperbolic space) on the other hand (cf. chap. 7). 

To keep this chapter as brief as possible, all longish, lesser instructive intermedi¬ 
ate steps of the calculations are skipped. This includes also the geometric calcula¬ 
tions needed for the concrete analysis of the regular maps. For the interested reader 
they are put in the appendix to allow to quickly check the used results in detail. 
Nonetheless are the concrete calculations of the stability limits kept in this chapter, 
but have been partly shortened. The reader in a hurry, who is not interested in the 
calculations at all, may be referred to the last section, where all the results are briefly 
summarized. 


6.1 Definition of Stability 

At the beginning it should be clarified what is meant when we talk about "stability 
of a SOM" in the context of this thesis. As mentioned in chapter 1, the SOM face a 
dimension conflict if the number of the intrinsic dimensions of the sample set in the 
feature space exceed the dimension of the map space. The SOM, trying to detect the 
most significant dimensions, embed the map as best as possible in the given set of 
samples and therefore in the additional dimension, if the inputs are scattered deep 
enough, to minimize representation error as best as possible. But an additional di¬ 
mension in e.g. sensory data, used as input, may not always result from interesting 
structures of an examined object, but instead from rather unwanted data like back¬ 
ground noise. So, we are eager to know, how large the scattering of this noise can 
be, such that the SOM still regards the additional dimension as insignificant. 
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To approach this question, we examine the behavior of a SOM with a two-dimensional 
map working in a three-dimensional feature space. The set of the input vectors is 
now bounded by the faces of a flat cuboid. The significant structure lies in the two 
directions, in which the cuboid is wider (x- and (/-direction) , and the remaining 
(z-)direction is reserved for the noise. The case in which the cuboid is totally flat, 
thereby corresponds to the situation where the number of intrinsic dimensions of 
the input exactly matches the number of dimensions of the SOM, while it exceeds 
it by one dimension otherwise. 

A straightforward definition of stable is therefore, that the SOM despite the addi¬ 
tional dimension in the feature space rest in its former equilibrium state of the flat 
cuboid. It may show, nevertheless, small distortions due to equilibrium fluctua¬ 
tions. In regard to our question above, we are thus interested in determining the 
limit of the edge length 2s* of cuboid in the z-direction up to which the equilibrium 
remains a stable one. In this context, a certain SOM is then regarded as being "more 
stable" as another one, if the equilibrium remains stable for larger s*. 

Fig.21 shows a HSOM that is on the left side still in the starting equilibrium state. 


HSOM (3,7)-map (3 layers) 
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FIGURE 21: HSOM with (3,7)-4 map (left) below stability limit (s=0.1) and (right) beneath (s=0.7) 

In the figures on the right side, the size of the scattering of the input towards the 
additional dimension already exceeds the stability limit. Having a SOM state with 
the representants only lying in or near the old, "flat" equilibrium would yield then 
a very poor representation for many samples. Therefore, the SOM leaves its former 
state and is drawn into a new one to embed itself better into the given set of samples 
and thereby minimizing the representation error. This can be easily observed by 
looking for significant lasting distortions in direction of the additional dimensions. 
This criterion will be used to determine the limit numerically in section 7, but for 
the analytic study following below, we are instead having a look at the dynamics of 
the SOM, or to be more exact, at the direction of the "drift" of the map. 
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6.2 *Fokker-Planck Equation for the SOM process 


We start our analytic approach by examining the dynamics of the "classical" SOM. 
It defines a learning system, i.e. a system that will reach its goal by a sequence of 
adaption steps defined by a learning algorithm. Each step results from the presenta¬ 
tion of a input vector/sample v and the interaction of each neuron with it (or with 
the cooperation with the winner neuron). This is represented by the transforma¬ 
tion 


zv = T{zv',v,s) (6.1) 

where zv' denote the state of the SOM before and w after the transformation. 

In general, a sample can be any observable in the environment like the measure¬ 
ment of an experiment, which are in most cases hardly predictable. However given 
a stationary environment, it is sensible to idealize the system by assuming that there 
is a stationary probability distribution P(v) for the samples. Eq.6.1 does then not de¬ 
fine a deterministic but a stochastic Markov process where the state of the SOM is also 
only defined by a non-stationary distribution function S(zv, t) of the states where t 
is the iteration time. Knowing this distribution at a certain time step, it is then possi¬ 
ble to compute S(w, t + 1) using the transition probability in the next adaption step 
to get state zv' when starting in zv. 


Q(zv,w) = Y2 dvS(zv — T(w‘,v,e))P(v) (6.2) 

r ■'b(Wr) 

v -v-' 

I dv 


In order to analyze the dynamics of the stochastic process of the SOM we shall fo¬ 
cus on how to derive the time development of the distribution function of the SOM. 
We will further assume, that the map is already in the vicinity of its equilibrium i.e. 
stationary state and that the step size £ is therefore sufficient small, i.e. we want to 
investigate the system only in the limit of small £. A mathematical approach is, to 
adapt the Fokker-Planck equation, which describes the time evolution of probability 
functions. It was first used as a statistical description of Brownian motion of a par¬ 
ticle in a fluid, but has been generalized to other observables as well. In our case, 
the respective observable will be the state of the map. 


Let us begin with a given array A of neurons , labeled by their discrete positions 
r in the map space and let zv = (w - f] ,..., zv,: N ) denote a state of them. Given that, 
we can then determine the Dirichlet or Voronoi tesselation { F ?| ,... F ? n } of the feature 
space V in respect to A by 


,(zv) = <ju e v\ 


v - zv r . 
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(6.3) 


We will call then a tile F? (zv) the feature set or voronoi cell of neuron r,-. Then the 
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probability for a sample to belong to P 7 . (zv) is given by 

Pr ; (re) := f dvP(v) (6.4) 

Jf 7 . (w) 

The expectation value of the samples restricted to the voronoi cell i.e. the centroid is 
then defined by 
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r i 
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p 7 i {w) 



(6.5) 


According to the update rule in eq. 1.1, for each selection of an input vector v the 
Transformation T of the current state of the neurons zv is defined by: 


zv 7 . = (1 - e%)4. + £% 


( 6 . 6 ) 


where s is the winner neuron, i.e. v E Fg(zv'). As mentioned in the introduction 
above, we are now interested in the dynamics of the SOM (in the limit of small 
step sizes e), i.e. the evolution of zv, or to be exact, S(zv, t) in the time. Since we 
have a Markovian process, we will make use of the Chapman-Kolmogorov equation 
(cf.e.g.[Kam87]). Inserting the transition probability (Eq.6.2) we get 


S(o?,f + l) = f d N zv'Q(zv,zv')S(zv',t) = f ^ N zv f dvP(v)S(zv — T(zv',v, e))S(zv', t) 
J r J J F r (w r ) 


(6.7) 


which is the distribution of the states at the iteration time t. 

If we are in or rather very close to a stationary state, the probability to leave it 
should tend to be zero i.e. S(w, t ) is peaked around a zv which is chosen such that 


J dvP(v)T(zv,v,e) — zv = 0 


( 6 . 8 ) 


We can then substitute the distribution function by the distribution S(u,t ) of the 
deviations u from the stationary expectation value zv. 

S(u,t) := S(w + u,t) (6.9) 


After another few, barely instructive transformations, our version of the Fokker- 
Planck Equation is then finally obtained: 


\d t S(u,t) = Yjfmr' 


3_ 
n d u 7 „ 


\ 


dVn 
dzv -,# 


-(re) u 7 i n S{u, t ) 


\ ’ D rmr , n 

i £ ( 7 jO d 2 s(u,t) 

1“ 2 Ltrmr'n Ly rmr'n\ w ) du, m du-,„ 




( 6 . 10 ) 


where B and D are two matrices expressing the drift of the neurons in the feature 
space and the correlation of the drifts i.e. kind of diffusion, respectively. 
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Given that equation we can now derive explicit expressions for the expectation 
value and the correlation matrix of the deviations u? m namely Uf m (t) and C^ m ^ s (t) at 
any given point of time t: 


u(t) 

C(t ) 


Y(f)«(0) 


y(f) 


C(0) + [ £(t) 2 Y(t) 1 D(Y(t) 1 ) T dz 
Jo 


y (o 1 


( 6 . 11 ) 

( 6 . 12 ) 


where Y (f) a matrix given by 

Y(f)=exp(— B j e(r)dr) 

A special case for Eq.6.11 which we will need in the analysis of the regular maps, 
arises, when we assume that B and D commutes and e is constant. Then the equa¬ 
tion can be simplified and we get: 

C = ({{Urm ~ Urm){Usn ~ %n))) = e( B + B T )~ 1 D (6.13) 

6.3 * Analysis for Euclidean space with regular maps 


We are now going to determine the stability limit for the three possible regular Eu¬ 
clidean maps which we have found in chapter 3. Analogously to the approach in 
[RS88] we shall thereby have a closer look at the case of spatially uniform prob¬ 
ability densities P(v) of the samples that we present to the SOM. Since the paper 
already discussed the case of having a square map, we will mainly analyze here 
the other two regular maps i.e. the triangular and hexagonal tiling and only extend 
the square case to cover the VQ neighborhood function and the Gaussian neighbor¬ 
hood function in the limit of small a. 

We therefore assume that we have a three-dimensional parallelepiped V bounded 
by 


0 < 
0 < 

0 < 


x,\j < N (square) 

x < -N ,0 < i/ < —N 

- 2 - J - 2 

a/3 

x<N,0<y< 7^-N 


(hexagonal) 

(triangular) 


and —s < z < s where x,y and z are cartesian coordinates of the Euclidean space 
in which V is embedded. We then define a uniform distribution restricted to V and 
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thereby given by 

P(v) = [2sN 2 ] -1 (square) 
P(v) = [^-^sN 2 ]^ 1 (hexagonal) 

,/3 

P(v) = [-^-sN 2 ] -1 (triangular) 


The map is a two-dimensional N x N array (cf. section 3) of the particular form 
i.e. square, triangular or hexagonal. To avoid edge effects we furthermore impose 
periodic boundary conditions along the x- and ^-directions in both map and fea¬ 
ture space. Thus, obviously, an equilibrium state for the particular cases using the 
sample distribution as defined above is given by 


w r = me i + ne 2 


w r = 


w r = 


Vs _ , 
—ne 2 + 


Vs 


-ne 2 


(square) 


me i 

if m and n even 

(m + \)e\ 

if m even and n odd 

(m + 2)e\ 

if m odd and n even 

(m + \)ei 

else 

me i 

(m + \)e\ 

if n even , . 

else (tnan| 


(hexagonal) 


since it fulfills the condition of an equilibrium as in Eq.6.8: 

. J dvP(v)[T(ib,v,e] r — w r = 0 (6.14) 

We should remind us that, as already noted in section 5.2.5, this equilibrium state is 
not unique but represents a whole class of states closed under the symmetry trans¬ 
formations which are the rotations and translations in the jc-y-plane. This particular 
choice of the representant here is just for the purpose to get a convenient selection 
of origin and orientation of the coordinate system. 

Reminding that we decided to work here with small and constant £ and if a dis¬ 
tribution S(u,t ) with finite variance and expectation value w exists, we can calcu¬ 
late the fluctuations about this stable equilibrium state by using Eq.6.13. S(u) : = 
lim^oo S(m, t) thereby denotes the corresponding stationary distribution function 
of the deviations in the equilibrium. 


One difficulty concerning the Fokker-Planck equation is that the deviations u? are 
coupled. Fortunately, Dy m ^ n and Brmr'n are translational invariant i.e. depend only 
on the difference r — r' and on m,n. Thus, we can easily switch to the Fourier space 
by Fourier transforming the deviations: 



1 

N 
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As each mode amplitude is distributed independently, we can decouple the dis¬ 
tribution function and thus finally get a mutually independent stationary Fokker- 
Planck equations for each of the so-obtained individual mode distributions Sp 


EH*), 


1 dll 


9 u m S^(u) + 


9 2 


l du m du n k 


Sr(u) = 0 


with 


(6.15) 


D{k) 

B(k) 

where 


Ee*^D„ = i 


N 2 


1 - l&a® 

1 ,( 0 ) 


[(Vjfi(*))(Vgfi(*)) T + M|S©| 2 ] 
- ^(iv E 6(S))6(*) T 


M = — 
2s 


Fy(w) 


dv(vv T 


is the covariance matrix of the input vectors v over a feature set Fy(iv), h(k) is the 
Fourier transform of the (translation-invariant) neighborhood function hyg and d(k) 
and b(k ) are the Fourier transform of the shift of the centroid of the voronoi cells 
Fj? for small deviations of a node Wy in the feature space, given by the matrix a- r -y, 
and the corresponding relative change in the volume and therefore probability of 
the cells, respectively The concrete matrices are hereby derived in the appendix A.3 
where we compute them in detail for the various types of maps. 


To simplify the further consideration, we postulate that in the case of regular maps 
B and D will have a diagonal block structure i.e. the components with index (3, i) 
and (/, 3) vanish except for i = 3. Thus we can easily determine the third eigen¬ 
vector of both matrices. Furthermore do we postulate that the matrices commute 14 
(We will check both assumptions when we deal with particular maps below). The 
latter property now allows us to meet the prerequisites of the simplification done 
in Eq.6.13 and can make use of it even if we have to deal here with the Fourier 
transformed version. The only detail, we have to modify, is, that instead of the 
transposed matrices we have to use the transposed and conjugate-complex counter¬ 
parts. We thus get 




gg) 

25R(A«(fc)) 


An instability in the z-direction arises where 


(«3(£) 2 ) 


eP(k)33 

2K(B(k) 33 ) 


(6.16) 


14 for the Gaussian neighborhood function at least for the approximation in the limit of large a 
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exhibit a singularity i.e. the limit where the denominator becomes negative for cer¬ 
tain choices of k and s. This corresponds to the point at which the equilibrium de¬ 
fined by our differential equation 6.15 is no longer attractive as B is no longer posi¬ 
tive definite. The smallest s where this occurs is then the stability limit, denoted by 

s*. 

As we are now only interested in the fluctuation along the z-direction, it is hereby 
sufficient to consider only the following components of B,D and M instead of the 
whole matrices: 

t>330) = 


B33 (k) 


m 

N 2 


„ h(k )„ 
h( 0) v ; 




(6.17) 


As in or near the equilibrium the distributions of the z-components for the input 
vectors in each voronoi cell are not correlated to the other components and the 
voronoi cells themselves are all identical, all but one values in the third column and 
row vanish. M 33 is the only non-vanishing one and is equal to the variance of the 
uniform distribution which is given by 

M 33 = ^(2s) 2 = ^ (6.18) 

Thus, we have finally derived everything we need for the following analytical anal¬ 
ysis. 


6.3.1 ^Square (4,4)-map 

We want to extend first the result of [RS88] by studying the ultra-short ranged 
neighborhood function i.e. explicit vector quantization and the limit of cr —> 0 for 
the Gaussian neighborhood function. 


Long-ranged, Gaussian neighborhood function 


We start with the Gaussian neighborhood function as it has already been analyzed 
in the paper for a similar case (i.e. for r>l) and we can use some intermediate re¬ 
sults. D and B (cf.[RS88, Eq.64-69]) satisfy the block-matrix condition and both com¬ 
mute in the limit of small <j since we can then approximate D ~ ^P^Mexp(— k 2 a 2 ) 
and, in addition, £>12 and Lh] vanish in respect to the other components of B (B and 
D in this approximation are then both diagonal). We thus obtain for the fluctuation 
of the eigenmode amplitude of ii 3 for small sigma: 


(m 3 (A:) 2 ) = £7U7 2 


s exi 


p(— k 2 cr 2 ) 


3 — 2s 2 (2 — coski — cos^) 
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As mentioned above we now have to find the smallest s for which the denominator 
becomes negative i.e. has a null: 


3 — 2s 2 (2 — cos *4 — COS/C 2 ) = 0 — > s(k) = * -— -----— 

v ; w y4-2cosfci-2cosic 2 

Since sup^(4 — 2 cos k\ — 2 cos 1(2) = 8, the stability limit s* is given by 



(6.19) 


Ultra-short-ranged VQ neighborhood function 


Next we will take a look at the explicit VQ neighborhood function, which was de¬ 
fined in Eq.1.4. As a,b and M are already given, it just rests to determine the Fourier 
transform of h. 

h(k) = 1 (6.20) 

Since B and D obviously fulfill the block-matrix condition above, we can directly 
determine the eigenvalues that we need: 


Af (*) 

A ?(k) 


1 ( 2s 2 

B ” = w d-^ 2 -*) 


= D33 = 


M33 


s 

3N 2 


where ic(k) = cos(ki) + cos(k 2 ). Thus we get: 

^2 


iM k ) ) = 


es 


6(1 - f (2 - k )) 

The smallest s for which the denominator vanishes is then determined by: 


1 


2s 2 

IT 


(2 — k) = 0 => s 2 (k) = 


3 

2k -4 


The position of the minimum of s(k) coincides with the position of the minimum of 
K(k) in respect to k. The minimum of k is thereby —2. Thus the stability limit is 


( 6 . 21 ) 


Thus, we obtained the same result as for the limit of small neighborhoods in the 
Gaussian case above. 
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6.3.2 ^Hexagonal (3,6)-map 

We will now examine the first of the two other possible regular maps of the Eu¬ 
clidean plane, namely the (3,6)-map. Using the geometric calculations done in the 
appendix, we now obtain for small deviations of one node a shift of the centroids 
of the voronoi cells of each neighboring node in the feature space given by: 


hy = S(f' — f) 


- S(f+ei-f) 



x( -> . e\ . 

S{r+ — + —e 2 -r) 


xf ~, . U - 4 , 

${r + -~ —e 2 -r) 


xf ~, U . V3-, - 4 ^ 

S{r-- + —e 2 -r) 


xf ~, U _ 4 . 

— - —e 2 -r) 


- S(f+e i -f') 


l 

6 

1 

3\/3 

0 


48 


J_ 

36 

5 

36\/3 

0 

J_ 

36 


0 

0 

— 2g2 

9 b 

1 

9\/3 
* 


0 


0 

0 

-?s 2 

9 b 


36v/3 

0 


36 


36v/3 

0 


1 

9\/3 

* 

0 

1 

9\/3 

* 


36 
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36/3 

0 


+ 


0 

1 

9/3 

* 


-?S 2 

9 b 

0 

0 

— =g 2 

9 b 


0 


0 

0 

-?s 2 

9 b 


48 


0 

0 

-?s 2 

9 b 


The three-dimensional Fourier transformation for fl(fc )33 then yields: 

fl(l) 33 = + 

4s 2 ,r, ,? \ M , v/3 r ,k x V3 r 

= —(3 — cos (A:i) — cos(— + —k 2 )-cos(— - —k 2 )) 

4 s 2 -* 

= -f(3-*(*)) 

where k is a substitution to simplify the notation. 

Analogously we obtain for the z-component of fr(fc): 

b(kh = 0 
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Given these results we can again easily confirm that the matrices B and D have 
indeed such a diagonal block structure as we have claimed. We can thus easily 
calculate the stability limits for the three classes of neighborhood functions that we 
listed in 1.3. 


Long-ranged, Gaussian neighborhood function 

We start by determining the Fourier of the Gaussian transform given in Eq.1.2. We 
get 

rr 2 k 2 

h(k) = Ino 2 exp(-—) (6.22) 


Case cr 2> 1 


By inserting a(k) 33 ,b(k) 3 and h(k) in Eq.6.17 we obtain A® (k), but before, we will 
simplify k(k) by observing that in the case of large a either exp(— cr 2 k 2 ) is very small 
or /C ] and k 3 are sufficiently small and we can expand the trigonometric functions to 
leading order without causing a significant error: 


ii(k) = COs((Cl) + COs(y + f^-k 2 ) +COS(y - 

» d-fj + d-^^ + a-^#^) 


= 3 


kl 


kl 3 kl 


3k 2 


=3- 


Hence we get for A® (k): 


Itzct 2 . 4s 


k 2 (r 2 .. Irca 2 s 2 k 2 


k 2 a 2 , 


A 3 (k) = B 33 = -^-(1- — (3-k)exp(-—)) = -^(1- — exp(-—)) 


N 2 


2 

(6.23) 


and analogously for A 3 (k) 


a 3 (k) = D 33 = 4n N 2 Mexp(—k 2 cr 2 ') = 47 ^/ exp (~k 2 a 2 ) 


(6.24) 


Using Eq.6.16 we now get for the correlation of the corresponding eigenmode am¬ 
plitude: 


{u 3 {k) ) = e 


2 . ncr 2 s 2 exp(—k 2 cr 2 ) 


3(l-^exp(-^)) 


k 2 a 2 ' 


(6.25) 


Once more, we now have to find the smallest s for which the denominator van- 
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ishes: 


2 s 2 k 2 kV 2 2 3 ,kV 2 

6na 2 (l - — exp (- —))=0<O-s =^ 2 ex P(^-) 


s(*) = ^ ex P(^j“) 


To get the minimum, the derivative of s(k) in respect to k has to vanish: 


/k*V 2 VS<J 2 ,k*V 2 ! 

=* S (/C ) = _ 2^ eXp( ^ ) + ^ eXp( ^ )= ° 


=> 

k* = — 

=> 

S* =0 -i/— « 2.02cr 


a 


V 2 


Case <7 —t 0 


The case of small a can be tackled the same way as done for the (4,4)-map. We first 
notice, that B and D are blocked-shaped and diagonal and thus obtain now for the 
fluctuation (using approximation of Eq.6.23&6.24 for small a): 


(w 3 (£) 2 ) 


= end 2 


s 2 exp (—k 2 a 2 ) 
3-^(3 -it) 


And again, we have to find the null of denominator in respect to s to determine 

s*: 


3 - iH 3 ~ *>= 0 =*■ s(E) = ViS* ^ 


since by using global optimization methods we get inf^(k(k)) = —1.5. 

Thus we obtain a higher stability limit for this hexagonal map as above for the 
square case in the limit of small a, while the results for the very large a are the 
same. 



Short-ranged NN neighborhood function 

Here we have to determine the Fourier transform of the NN-neighborhood func¬ 
tion. Therefore we have to adapt Eq.1.3 to fit in this case of six nearest neighbors 
at the (relative) coordinates (±|,±^),(±|,=f^) and (±1,0). Thus the Fourier 
transformation result in: 

h(k) = J e lkx h(x)d 3 x = 1 + (exp(z'(y ± ^k 2 )) + exp(-z(y + 

+ (exp(z(y - ^-k 2 )) + exp(-z(y - ^h))) + (exp(zfc x ) + exp(-zfc 1 )) 
= 1 ±2cos(y + k 2 ) ±2cos(y - ^k 2 ) ±2cos(ki) = 1 ±2zc(k) 
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where ic is the same substitution that we used above. By Eq.6.17 we obtain for 

A|(fc) 




7-(l+2S)is 2 (3 



Analogously for A^ (k): 


A 3 d (^)=D33 = 3^(1 + 2 k ) 2 

Thus we obtain in this case for (u 3 (k) 2 ): 

s 2 (1+2k) 2 


(u 3 {k) ) = e 


6 [7- (1 + 2k)|s 2 (3 — ic)] 


Similar to the other cases we now get s* by determining the null of the denominator 
and find the minimum in respect to k, or to be exact, to k. 


s\k) 


63 

8k 2 - 20ic - 12 



So the stability limit is slightly larger than for the square case as we expected. 15 . 


Ultra-short-ranged VQ neighborhood function 

For the final case we can again compute s* straightforward. We already have com¬ 
puted h for the square case above in Eq.6.20 and since M 33 does not differ likewise, 
D has same third eigenvalue as in the square case: 

A?® = 3 ^ 

For the corresponding eigenvalue of B we get: 

A i ( k ) = B33 = ^(1 -« 33 ) = ]^ 2 '( 1 — 9 ~( 3 _ ^)) 

Thus we obtain for the fluctuations: 


(u 3 (k) 2 ) = e 


3 —s 2 (4 — f) 


s 2 (k) = 


4-Ik 


15 Here, we don't need to determine the exact wave number k. We just have to note that | is within 
the range of k(k) 
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As in the square case, the minimum of the stability limits s(k ) is determined by the 
minimum of K which is —1.5. Thus we get: 



which is again the same result as in the Gaussian case for small a. 


6.3.3 ^Trigonal (6,3)-map 


Finally we will take a look at the last remaining possible regular map, namely the 
(6,3)-map. Since the geometric calculations are in this case due to some asymmetries 
quite annoying we restricted them to the needed results for the deviation and the 
shifts in z-direction. Thus we got: 


/ * * 0 

5(r' — f) J * * 0 

\0 0 fs 2 

${r-- + —e 2 -r) 




-5(r + e i -f) 

* * 0 \ 

* * 0 

oo Is 2 J 
* * 0 \ 

* * o 

o o ^s 2 / 


* * o \ 

* * o 
oo Is 2 J 


The three-dimensional Fourier transformation a(k )33 therefor is: 


4s 2 4s 2 4s 2 ... e\ 4s 2 e\ 

33 = -3 -— exp(zcifc)-— exp(z(-- + — e 2 )k) - — exp(i(~- 

4 s 2 

= - 9 -( 3 -«) 


-^2 


where k is once more a substitution to simplify the notation. 
Furthermore we have for the z-component of b(k): 


b(k) 3 = 0 

As in the hexagonal case, we can easily confirm that the matrices B and D have in¬ 
deed such a diagonal block structure and we can, again, calculate the stability limits 
for the three classes of neighborhood functions. The only small difference, which 
we have to consider, is, that R is here complex-valued. Apart from that, most of the 
calculations can be shortened as they are the identical or at least very similar. 
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Long-ranged, Gaussian neighborhood function 
Case <Ji^> 1 


First, we simplify R (k) or, to be more exact, in the real part ft(k(k)') 16 for the limit 
of large cr: 


»(*(*)) 


K(exp(ieifc) +exp(z'(-y + ^e 2 )£) + exp(i(-y - ^e 2 )fc)) 
cos(?ifc) +cos((-y + ^e 2 )k) +cos((-y - ^ e 2 )k ) 

2 4 4 4 


Since the eigenvalues A| (fc) and A^*(/c) are now given by 


27TCT 


4s 2 


fcV 2 , 


A 3 (*) = -B 33 = —^(1-— (3-^)exp(-—)) 


N 2 v- 9 v- -v—rv 2 

A°(^) = D 33 = ^^Mexp(-fcV 2 ) = ^^exp(-fcV 2 ) 


3N 2 


we get here for the fluctuation 

s 2 exp(— k 2 s 2 ) 


(u 3 (fc) 2 ) = £7rn 2 


= £7TC' 


s 2 exp(—/< 2 s 2 ) 


= £7rn 


3 • 3f£(l — ^-(3 — f) exp(—3 — ^-(3 — K(£)) exp( — 
s 2 exp(— k 2 s 2 ) 


3- ^exp(-^ 


fcV 2 ' 


This result is identical to Eq.6.25 of the (3,6)-map. Thus the stability limit s* has to 
be also the same as for the other two maps: 


k 2 cr 2 
2 > 



Case a —> 0 

Since kappa = 9ft(ic), both A 3 (k) and the real part of A|(fc) are identical to their 
counterparts for the (3,6)-map. Thus, the stability limit for small a is again given 
by: 



16 since we are only interested in the denominator of (uT,(k) 2 ). 
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Short-ranged NN neighborhood function 

The Fourier transformation of the NN-neighborhood function adapted for the tri¬ 
angular map is: 


h(k) = 1 + exp{ie 1 k) +exp(z'(-y + ^e 2 )^) +exp(i(~ - ^e 2 )A:) = 1 + R 


Thus we get: 


B33 = 


N 2 




(9 — 3s 2 — 2s 2 K + s 2 k 2 ^ 


9 N 2 


= ^^2 |1 + S 33 | 


For the fluctuation of the eigenmode amplitude we obtain: 

es 2 |l + B 33 | 2 


(« 3 (k) 2 ) = 


r (9 — 3s 2 — 2s 2 !R(k) + s 2 3?(k) 2 ) 


=> s 2 (k) = — 


3?(k) 2 -2K(k) -3 


As s 2 (k) is minimal for ft(k' ) = 1, the wanted stability limit s* finally is: 


s* = - = 1.5 
2 


So this is the smallest stability limit for the NN case of all three regular maps. 


Ultra-short-ranged VQ neighborhood function 

At last, we will briefly determine the stability limit for the case of the vector quan¬ 
tization. Since 

^33 = ^(!-^( 3 -«)) 

and 



are very similar to the corresponding components in the hexagonal case (even K 
and 5 J(k) are identical), we get by the same calculations the same result: 
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6.4 Summary of the analytic results 


Tab.2 summarize all the results that we have obtained in the preceding section in 
detail (including resulting stability limits of the square map for NN and Gaussian 
neighborhood, which have been derived in [RS88, sec.5]). 


map 

Gauss (large a) 

Gauss (small a) 

NN 

VQ 

(4/4) 

(7y/f « 2.02(7 

sjl ~ 0-612 

f'i « 1.549 

m 0.612 

(3,6) 

CTy/f « 2.02(7 

m 0.707 

L604 

m 0.707 

(6,3) 

ay/% » 2.02(7 

yjl m 0.707 

\ = 1.5 

\j\ m 0.707 


Table 2: Results of the analytic stability analysis 


It can be noticed that for the three choices of maps, the results for the long-ranged 
Gaussian with a 3> 1 are the same. This may result from the fact that, given a 
winner node, the neighboring nodes of it are all adapted in the same way as the 
neighborhood function hardly differs there and in large distances where the neigh¬ 
borhood finally decreases significantly, the differences of the lattices from the dis¬ 
tant viewpoint of the winner node become blurred as the differences between the 
distances from these outer.nodes to the winner are all relatively small. 

The stability limits for the Gaussian function with very small a are, as may have 
been expected, identical to the one obtained by Vector Quantization. The (4,4)-map 
thereby exhibit a smaller limit than the other two maps. 

The use of nearest neighborhood functions results in stability limits that grow with 
the number of nearest neighbors, i.e. a SOM using the hexagonal map with 6 nearest 
neighbors at each node has the highest stability while the triangular map with half 
the number of nearest neighbors yields to lesser stable SOMs. A possible explana¬ 
tion may be that the "visibility range" of a node in respect to inputs is enlarged by 
the size of the voronoi cells of its neighbors. Thus, the "height" of the input space 
becomes smaller compared to the spread in the other directions. It is the same ef¬ 
fect that we can notice in the case of the Gaussian neighborhood where the stability 
rises proportional to u. 

In the following chapter 7, we are now going to verify these result using a numerical 
approach. 














Chapter 7 

Numerical Analysis / Monte Carlo 
Simulations 


In chapter 6 we have analytically determined the stability limit of the case of both an 
Euclidean map and feature space where the feature space exceeds the two-dimensional 
map space by one dimension. In the following we will not only try to verify these 
analytic results, but furthermore extend this analysis to the case where we have a 
hyperbolic map space (i.e. HSOM) and, in addition, an hyperbolic feature space (i.e. 
GRiSOM). Again, the readers in a hurry may be referred to the last section of this 
chapter, where the results of this analysis are very briefly summarized. 

Before starting to discuss the simulation in detail, it has to be noted that the used 
software library (cf. chapter. 5) has been optimized to support an easy intuitive im¬ 
plementation of map and feature spaces. Thus, the SOM algorithm is implemented 
in the most general way and could therefore not be specially optimized for partic¬ 
ular choices of spaces, maps or neighborhood functions. That means, that, for each 
adaption step, it has to explore and adapt the whole map which may take a lot of 
time depending on the number of neurons in it and, since time is a limited resource, 
this drastically restrains the configurations of the SOM that we are able to examine. 

In first testing runs it has been observed that each of the available processor cores 
(single cores of Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz) was capable of 
adapting about one billion neurons per hour using an Euclidean map and feature 
space and a long-range neighborhood function 17 . By using an hyperbolic map space 
the speed does not differ significantly as we still use Euclidean geodesics, but in the 
case of using hyperbolic map and feature spaces this decreases even down to only 
25 millions per hour. This led to the dilemma that in order to avoid possible bound¬ 
ary effects the map had to be as large as possible but at the same time the sample 
set has to be large enough so that there are enough samples in each Voronoi cell of 
the neurons to ensure that the map has enough "time" to adapt to the distribution. 
Therefore we had to choose a trade-off that bounded the computing time of the re¬ 
sult for one parameter set at maximal 2-3 hours. Fortunately, the whole exploration 
of the parameter space could be easily parallelized, because the result seldomly de¬ 
pended on each other. Thus the whole search could be rolled out onto the about 
available 20 workstations (i.e. 40 processor cores). That reduced the duration of the 
whole numerical work to "merely" several weeks instead of many months. 


17 The use of NN and VQ neighborhood functions would, of course, boost the speed 
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7.1 Test procedure 


To determine the stability limits of the SOM configurations numerically Monte Carlo 
simulations of the SOM algorithm are carried out. That is be done by generating 
samples by using the according uniform distribution as defined in chapter.4 which 
is then used to adapt the neurons using the SOM algorithm. By doing so, the SOM 
will adapt itself to the given sample set embedding itself into it. 

In order to determine a particular stability, we have to consider all the possible 
parameters on which the result can depend. In general, these are: 

• type of map space 

• type of feature space 

• type of regular map i.e. tessellation, which is subdivided into: 

- type of polygon (number of vertices) 

- number of neighbors at each vertex 

- edge length 

• neighborhood function (here: Gaussian, NN, VQ) 

• adaption parameters £ and a 

• distribution of samples, or to be more exact 

- type of distribution 

- boundaries is map dimension 

- parameter s for boundaries is extra dimension 

• size of sample set 

It has to be remarked, that several of these parameters depend on each other. So 
determines the type of the map and the feature space not only the possible choices 
for the tessellations (cf. Ch.3) but also the type of distribution of the samples that 
is used as listed in Ch.4. Furthermore is the size of the sample set restricted by the 
available resources (i.e. computing time and storage space) as mentioned above. 
The used values for the various maps are listed in tables 3,8 and 9. 

The most important parameter is, of course, s. By fixing all other parameters and 
varying s we now determine the stability limit s* i.e. the smallest value for s where 
the equilibrium state of the SOM that lies in a plane in the map dimensions is no 
longer stable. In the analytic approach we examined the fluctuations in the extra di¬ 
mension to determine, if the equilibrium is still attractive. Here, we use a different 
method based on the Mean_Extra_Dim_Analyzer introduced in the evaluation sec¬ 
tion of chapter 5. In other words, we measure the mean deviation U 3 or, to be more 
exact, the normalized deviation %/ s from the initial equilibrium state, which is the 
equilibrium for (s = 0) in direction of the extra dimension 18 . As long as the ini¬ 
tial equilibrium is attractive, the analytic value of % vanishes. Since we work here 


18 see Ch.5.2.5 for more details about exact definition for non-Euclidean spaces 




7.1 Test procedure 


91 


numerically with a finite sample set, the mean state will show distortions around 
the equilibrium, but these fluctuations and thus the measured errors are very small 
and can be (easily) identified and reduced as they depend on e. By contrast, above 
the stability limit, where the SOM has moved into a new equilibrium state which 
also uses the extra dimension, the measured error is significantly larger and is only 
slightly affected by a variation of e. As an example, Fig.22 shows a graph of the 
error in respect to s. 


(6,3)-map NN N=16 



size of extra dimension s 


FIGURE 22: Transition between former and new equilibrium at the stability limit 

To find s* for a given SOM configuration, we thus have to do a first search run 
to roughly locate it by looking out for significantly large errors. Once we narrow 
the limit down to a interval, we can iteratively refine the search by using smaller 
step sizes for the sweeps until we will have pinpointed the stability limit up to a 
sufficiently small error margin. In Tab.22, the search runs for a (6,3)-map with N=16 
are listed as an example. Fig.23 shows then the corresponding graph. 

For the first runs e = 0.05 was used. In the last run e was then reduced to 0.005, 
which result in a sharper edge as can be seen in the graph. It can also be noticed 
that we did not use ideal nested intervals for the search of s*. In fact, we used about. 
10 steps for each sweep. This may have been slower, but allows a better overview 
about the neighborhood around the position of the stability limit. This is very im¬ 
portant for the cases where the transition between the equilibriums is not such a 
sharp increase but instead "blurred" by accompanying fluctuations. Another ben¬ 
efit of the refinement by a factor 5 or 10 at each run is that the search needs less 
supervision as the sweeps have then a runtime of 12-24 hours. With 30-40 searches 
running parallel and the need to adapt the parameters after each search run, the re¬ 
quired time for the supervision of the whole simulation is thus reduced to a feasible 
amount. The search is now stopped when the stability limit is determined within a 
wanted accuracy. For most of our search runs, this accuracy limit was set to 1-5%. 
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(6,3)-map NN N=16 



Figure 23 


Only for very small values that were obtained by using the HSOM or for very time- 
consuming searches we already stopped the search earlier which resulted in higher 
error margins. The stability limits thus obtained are listed in detail in the appendix 
(cf. Tab.15,16,17). They are the basis for all the evaluations that are carried out in the 
following sections. 


7.2 Verification of the analytical results in the Euclidean case 


As mentioned, we want to verify the analytical results obtained in chapter 7. In 
order to do so, we generate the three types of maps i.e. the triangular, square and 
hexagonal maps of the Euclidean plane. The shape of the maps is a rectangular grid 
of N x N neurons as seen in Fig.12. Instead of working with a constant length of 
the edges as we did in the analytical approach, we fix the area of the 2-dimensional 
map to 1. Thus we use the following normalized edge lengths: 


j(3,6) _ 2 (4/ 4) _ 1 (6,3) _ 2 

NN ~ ^3 N' NN ~ N' NN ~ ^27N 


(7.1) 


In the case of the (3,6)-map, the periodic boundaries (if used) are therefore given 
by — ± in the x-direction and ±in the (/-direction. Analogously for the 

(4,4)-map, we use ±1 (in both directions) and for the (6,3)-map ±~^= and As 
we now embed the map one-to-one into the 3-dimensional Euclidean feature space. 
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the same boundaries are used for the uniform sample set, which we use to examine 
the stability, in two of the three dimensions and, if used, the periodic boundaries in 
the feature space. 



6 x6 

Eucli 

10 x10 

dean case 
12 x12 

- lattice s 
16x16 

ize 

24x24 

32x32 

number of nodes 

36 

100 

144 

256 

576 

1024 

number of meas. per param. 

10000 

10000 

10000 

5000 

2500 

2500 

number of adapt.steps 

20 Mio 

20 Mio 

20 Mio 

10 Mio 

5 Mio 

5 Mio 


Table 3: Choice of the size of the sample sets for Euclidean maps 


7.2.1 Short-ranged NN neighborhood 

Since the distance d\;,v depends on the choice of the map type as well as on its "size" 
N, the range of the NN neighborhood function differs for most configurations. So, 
in order to make the "raw" results, that we obtained by the Monte Carlo simulation, 
comparable, we have to renormalize them in respect to d\i\: i.e. dividing them by 
the corresponding distance between nearest neighbors. These "normalized" stabil¬ 
ity limits are listed in Tab.4 and visualized in Fig.24 19 . 


lattice size 

(3,6) 

map type 
(4,4) 

(6,3) 

6 x6 

1.627(± o.oo8) 

1.548(± 0.006) 

1.525(± 0.007) 

10 x10 

1.592(± 0.007) 

1.540(± o.oio) 

1.504<± o.on) 

12 x12 

1.595(± 0.008) 

1.536(± o.oi 2 ) 

1.504(± o.on) 

16x16 

1.600(± o.on) 

1.536(± o.oi6) 

1.514(± 0.018) 

24x24 

1.595(± o.oi6) 

1.536(± 0.024) 

1.504(± 0.027) 

32x32 

1.600(± 0.021) 

1.536(± 0.032) 

1.495(± 0.036) 

mean 

1.602(±o.oi) 

1.540(±o.oi) 

1.508<± o.oi) 

analytical 

1.604 

1.549 

1.500 


Table 4: Normalized stability limits for Euclidean maps using the NN neighborhood function 

Except a few outliers in the case of smaller grid sizes, these results verify nicely our 
analytical calculations and confirm the conjecture that the stability limit rises with 
number of nearest neighbors (at least in the case of regular Euclidean maps). The 
outliers may thereby result from the periodic boundary conditions, since they put 
constraints onto the whole map and may thereby have an effect on the stability of 
the map. 


19 The corresponding errors have hereby been calculated by standard Gaussian error propagation. 




























94 


Chapter 7 Numerical Analysis / Monte Carlo Simulations 


EUCL_EUCL_PERIODIC - NN 
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FIGURE 24: Stability limits of the three Euclidean maps for the NN neighborhood function 


7.2.2 Long-ranged Gaussian neighborhood 

In the case of the Gaussian neighborhood function, we want to verify the results we 
obtained analytically for the cases of either very small or very large cr and further¬ 
more analyze at least qualitatively the stability behavior in between. 

As noted above, the distances between nearest neighbors strictly depend on the 
type of the map and the size of the grid. So, we again have to renormalize the deter¬ 
mined results. This includes cr, which becomes the relative cr = cr I d\f,\;,and anal¬ 
ogously, is s* normalized in the same way. Fig.25 now plots the results by using a 
as x-values and the quotient of the normalized stability limit and a for the second 
axis. 

We can easily identify the two subdomains we will discuss below. For small a the 
graphs are similar to hyperboles, while for larger <7, they tend to approximate a con¬ 
stant function for rather large <7. As these are the domains that we already analyzed 
in the analytic approach, we will examine them now more closely. After that, we 
will also briefly discuss the domain in between at least qualitatively. 


Case cr 1 

In chapter 6 we have seen, that for very large a the stability limit increases approx¬ 
imately proportionally to a. Thus, we examine here the quotient of the normalized 
stability limit and the normalized range of the Gaussian neighborhood. The results 
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FIGURE 25: Stability limits of the three Euclidean maps for the Gaussian neighborhood function 

thus obtained have been already shown in Fig.25. As we may have expected, the 
approximation error, i.e. the difference between a point on the curves and the con¬ 
stant function at the stability limit for f >1, decreases exponentially in the domain 
of large a. This can be easily checked by plotting the logarithm of this error, which 
has been done in Fig.26 for the hexagonal map. 

Thus, we can extrapolate the stability in the limit ir » 1 by fitting the curves in the 
domain of large a in respect to an exponentional function a ■ e bx + c. The values for 
c thus obtained are the desired results, which are now listed in Tab.5 20 . 



map type 

u 

(3,6) | (4,4) | (6,3) 


periodic boundaries 


0.1 

2 .01(±o.oi) 

1.99(±o.oi> 

1.99<±o.o2) 

0.2 

2 .00(±o.oi) 

1.92(± o.oi) 

1.78(±o.oi) 


non-periodic boundaries 


0.05 

1.98 (±0.02) 

1.75(±o.io> 

1.82(± 0.09) 

0.2 

— 

1.50 (±0.02) 

— 

analytical 

2.02 


Table 5: Extrapolated stability limits for Euclidean maps using the Gaussian neighborhood function 
with large a 

We see that the analytic results are well verified for the hexagonal maps with and 
without periodic boundaries and for the square and the triangular map in the case 

20 we hereby only consider the parts of the curves for a = 0.1 and cr = 0.2, which satisfy a > 1. 
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EUCL_EUCL (3,6)-map / Gaussian neighborhood 



FIGURE 26: Logarithmic plot of the approximation error for a hexagonal (3,6)-map 


of periodic boundaries and (7 = 1. For the other cases we observe some significant 
deviations in respect to the analytic results. The lower stability limits in the case of 
missing periodic boundary conditions may be caused by the fact that the neurons at 
the "loose" edge have far fewer neighbors. As we have discussed in the summary 
of the analytic results, the size of the neighborhood does have a direct impact on the 
stability of a SOM. The lower stability limits here seem to provide further evidence 
for this assumption. The reason for the lower stability of the SOM configurations 
using (7 = 0.2 may be first not so evident, but a possible explanation may be found 
by taking into consideration, that a is very large in respect to the size of the whole 
feature space, which has, as is known, a base area of 1. We actually imposed bound¬ 
ary conditions to avoid edge effects. But this does work only locally i.e. the winner 
neurons only cooperate with one of the infinite many representatives, which are a 
result of the periodic condition. Thus the cooperation always only works on a vol¬ 
ume which is identical to the bounded space i.e. the neighborhood cannot be larger 
than the space itself. Thus, if the range of the neighborhood is increased beyond 
this limit, the size of the neighborhood and therefore, as we assume, the stability 
remains the same and thus the quotient that we examined above becomes signifi¬ 
cantly smaller. Thus, also this observed effect may back up our assumption. To be 
more certain, it would be necessary to perform further studies of SOMs with such 
large-ranged neighborhood functions. 
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Case <7 —>■ 0 

Due to results of the analytic analysis in the preceding chapter, we expect, that, for 
the case of small (relative) cr, we observe a stability limit which, unlike to the other 
limit, does not depend on a. To get sufficient small a we restrict our evaluation of 
the numerical results to the two smallest choices of a and the lattices with a rather 
large edge length. According to Eq.7.1, this are those with fewer nodes. Fig.27 and 
Tab.6 show these results. 


lattice size 

a 

(3,6) 

map type 

(44) 

(6,3) 



a norm, s* 

<7 norm, s* 

a norm, s* 


periodic boundaries 


6 x6 

0.025 

0.099 

0.703(± 0.004) 

0.150 

0.738(± o.on) 

0.171 

0.827(± o.oi4) 

10 x10 

0.025 

0.165 

0.711 (±0.007) 

0.250 

0.710(± 0.020) 

0.285 

0.809(± 0.023) 

12 x12 

0.025 

0.197 

0.719(± o.oi6) 

0.300 

0.720(± 0.024) 

0.342 

0.834(± 0.027) 

16x16 

0.025 

0.263 

0.705(± o.on) 

0.400 

— 

0.456 

— 

6 x6 

0.05 

0.197 

0.691(± 0.008) 

0.300 

0.720(± o.oo6) 

0.342 

0.834(± o.oo7) 

10 x10 

0.05 

0.329 

0.724<± o.oo7) 

0.500 

1.090(± 0.020) 

0.570 

1.208(± 0.057) 

12 x12 

0.05 

0.395 

0.829(± o.oo8) 

0.600 

1.356(± 0.024) 

0.684 

1.450(± 0.027) 


non-periodic boundaries 


6 x6 

0.05 

0.197 

0.655(± 0.039) 

0.300 

0.696<± o.oo6) 

0.342 

0.827(± o.oo7) 

10 x10 

0.05 

0.329 

0.684(± o.oo7) 

0.500 

1.070(± o.oio) 

0.570 

1.106(± 0.023) 

12 x12 

0.05 

0.395 

0.774<± o.oi6) 

0.600 

1.284(± 0.024) 

0.684 

1.299(± 0.027) 

constant fit 

0.703(± 0.008) 

0.726<± 0 . 006 ) 

0.826(± o.oo3) 

analytical 

0.707 

0.612 

0.707 


Table 6: Normalized stability limits for Euclidean maps using the Gaussian neighborhood function 
with small j 

We now furthermore fitted the data points, which have a cr lesser than 0.35, using 
constant functions and taking the error margins into account. The results thus ob¬ 
tained have also been added to the table and the graph. Remembering the analytic 
results, we have expected, that the stability limit is s* ~ 0.612 for the square map 
and s* ~ 0.707 for the other two maps. But instead, the numerically determined 
stability limits for all but one are a good deal higher. Only the hexagonal (3,6)-map 
shows the stability limit that we calculated before. The analytic value lies even in¬ 
side the error margins of the computed constant fit. As we can not ad hoc pinpoint 
the source of this very large systematic error for the other two cases, we will de¬ 
lay the discussion until the results of the VQ are evaluated to see if this deviation 
only occurs when using the Gaussian neighborhood function or even when vector 
quantization is used. 
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EUCL_EUCL_PERIODIC - GAUSS (sigma small) 
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FIGURE 27: Stability limits of the three Euclidean maps for the Gaussian neighborhood function with 
small cr (only for periodic boundaries) 


Case a "between the limits" 

Even if we have omitted a detailed discussion of the a between the limits, we can 
explain, at least, a basic aspect of the shape of the curves shown in Fig.25. As we 
neglected all higher terms of k 2 and k\ in the approximation for large Gaussian 
neighborhood ranges, these terms would have decreased the s 2 k 2 term in the equa¬ 
tion of Af. Thus, the resulting stability limit increases then. This is the reason why 
the curves are converging to the proportional factor of the limit from above. 


7.2.3 Ultra-short-ranged VQ neighborhood 

The last analytic results, which we still have to verify, are those for the ultra-short- 
ranged neighborhood i.e. vector quantization. In the analytic approach we have de¬ 
termined the same stability limit for the three types of maps (using periodic bound¬ 
aries) as for the Gaussian neighborhood function in the limit of small a. Now, by 
using the simulation we obtain the corresponding numerical results shown in Tab.7 
and Fig.28 (The renormalization is thereby the same as in the NN case). 

The first thing, that can be noticed, is, that these results show the same anomaly 
in respect to the analytic limits as for the Gaussian neighborhood function, i.e. the 
stability limits of both the square and the triangular lattice exceed the analytic value 
significantly. These discrepancies thereby do not show any sign of dependence on 
the grid size, which let us assume that they are not caused by any boundary effect. 
Indeed, the numerical values match the ones obtained in the Gaussian case quite 
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lattice size 

(3,6) 

map type 
(4/4) 

(6,3) 

6 x6 

0.691 (±0.004) 

0.690(± 0.030) 

0.821(4 0.007) 

10 x10 

0.691(± 0.007) 

0.680<± o.oio) 

0.775<± o.on) 

12 x12 

0.687(± 0 . 016 ) 

0.696(± 0.024) 

0.793(± 0.027) 

16x16 

0.684(± o.on) 

0.672(± 0.032) 

0.766(± 0 . 018 ) 

24x24 

0.695(± 0.032) 

0.696(± 0.072) 

0.766(± 0.055) 

32x32 

0.632(± 0.063) 

0.704(± 0.064) 

0.802(± 0.073) 

constant fit 

0.690(± 0.005) 

0.683(± o.oo3) 

0.803<± o.oio) 

analytical 

0.707 

0.612 

0.707 


Table 7: Normalized stability limits for Euclidean maps using the VQ neighborhood function 
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FIGURE 28: Stability limits of the three Euclidean maps for the VQ neighborhood function 


well. The same is almost true for the comparison of the numerical and analytic 
result for the hexagonal case, which may lead to the assumption that either we 
have a numerical error that only shows up for the lesser symmetric grids (we have 
seen such a difference in the behavior of the hexagonal map for a 3> 1 as well) or, 
also possible, we missed in the analytic approach a constraint of k, which may then 
increase the infimum of K,k and k and thus the corresponding stability limits for the 
VQ and the Gaussian neighborhood function with small a. 


































100 


Chapter 7 Numerical Analysis / Monte Carlo Simulations 


7.3 Analysis of configuration with hyperbolic map space 
(HSOM) 


After having a look at the "Euclidean-Euclidean" case above we want now inves¬ 
tigate the stability properties of SOMs consisting of non-Euclidean map or feature 
spaces. More precisely, we will start by examining the case where we use a HSOM 
i.e. having an hyperbolic map space and continue with the case of having addition¬ 
ally a hyperbolic feature space in the next section. 

In chapter 3, we have discussed two possible truncations for the hyperbolic map. 
Due to lack of time as the implementation of software took longer as expected, 
we, unfortunately, had only the time to examine one of them. We decided to use 
the equi-hierarchical method, i.e. we created the map by restricting the depth of 
the hierarchical structure, because this is the method used in founded papers (e.g. 
[Rit99],[ORGG01]) and we therefore obtain results which are comparable and may 
e.g. help to estimate the sensibility of the used HSOM in respect to non-hierarchical 
noise in a hierarchical sample set. 

Since we nonetheless use a disk-like spread of the sample set (cf. section 4.1.3) with 
the radius of 1.5 • ^mN/ this will result in certain deformations of the embedding of 
the map in the feature space, as the non-circular map will try to fill the whole disk 
as best as possible. These deformations may then cause a difference between the so 
determined stability limits and the ones we would gain if we would use the other 
truncation method which would lead to a more circular map and therefore in lesser 
distorsions. 

Nevertheless, we try to determine certain stability properties of the HSOM. 


7.3.1 Stability vs. Size 

One of the most important features of this SOM, which is gained by using a hyper¬ 
bolic map, is the capacity to easily adapt to higher-dimensional Euclidean space as 
the number of nodes at the edge growth exponentially in respect to the number of 
layers/radius. Tab.8 lists, besides the number of measurements and size of sample 
sets we used, the number of nodes in the map and of the nodes in the outermost 
layer. We want to start by examining the dependency of the stability limit on the 
size of the map, or, to be more exact, the number of nodes at the edge. We therefor 
calculate the ratio between the stability limits of maps which differ only in respect 
of the number of layers and compare it to the corresponding reciprocal ratio of the 
number of outer map nodes (cf. Fig.29). 

Even if the ratio of the number of edge nodes for some of the results does not lie 
within the calculated error margins, the relation between the size of the outermost 
layer and the stability is quite clear i.e. the stability limit appear to be inversely 
proportional to the number of nodes in the outermost ring. This corresponds to the 
fact that the HSOM can solve or, at least, reduce the dimensional conflict due to its 
much larger growth rate in respect to the Euclidean extra dimension of the input. 
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(3,7)-{3,4,5j layers 

HSOM case - map 
(3,9)-{3/4} layers 

(4,5)-{3/4/5j layers 

#nodes 

85,232,617 

271,1306 

61,166,441 

#nodes in outer layer 

56,147,385 

216,1035 

40,105,275 

# meas./param. 

2500 

2500 

2500 

# adapt.steps 

3.75 Mio 

3.75 Mio 

3.75 Mio 



HSOM ca 

(6,4)-{3/4/5/6} layers 

se - maps 

(7,3)-{3/4/5/6} layers 

# nodes 

49,133,353,929 

22,40,70,115 

#nodes in outer layer 

32,84,220,576 

12,18,30,45 

# meas./param. 

2500 

2500 

# adapt.steps 

3.75 Mio 

3.75 Mio 


Table 8: Size of the sample sets/maps/layers for HSOM 
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FIGURE 29: Ratio of the stability limits between HSOM maps with different sizes 


7.3.2 Short-ranged NN neighborhood 

It has been verified above, that the stability for the classic SOMs using NN neigh¬ 
borhood functions depends on the type of map. We suggested that the reasons are 
the higher number of nearest neighbors. We want to check, if this conjecture also 
holds when a HSOM is used. To be able to compare the different maps and number 
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of layers we use the knowledge about the dependence of the stability on the edge 
size and thus normalize the stability limit in respect to the number of edge nodes. 
So we get rid of the already known influence of the different edge size and get the 
results plotted in Fig.30. 


DISK EUCL-NN 



number of layers 


FIGURE 30: Stability limits of the five Hyperbolic maps for the NN neighborhood function 

The (3,7)- and (3,9)-maps seem to have the highest stability and there is a gap be¬ 
tween them and the others. It is thereby difficult to determine how large the in¬ 
fluence of the deformations exactly is (see above). For the (3,6)-map with 3 layers, 
it may certainly be negligible as this map is mostly circular, but the others will be 
more effected, which may, for example, cause that the (3,9)-map has a slightly lower 
stability limit than the (3,6)-map or the lower limit of the (6,4)-map compared to the 
(7,3)-map. 


7.3.3 Long-ranged Gaussian neighborhood 

Now, we want to check if the HSOM shows similar dependencies on the size of cr 
i.e. range of the Gaussian neighborhood as the classic SOM did above. We therefore 
determine again the quotient of the stability limit and cr. By normalizing again the 
result by the size of the edge, we obtain the results shown in Fig.31. 

The similarity to the corresponding graph in the classic case (cf. Fig.25) is quite 
obvious. We can again identify the three domains, i.e. the domain of very small a, 
the one with f > 1 and the third one in between. We want to take a brief look at 
the first domain. 
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FIGURE 31: Stability limits of the five Hyperbolic maps for the Gaussian neighborhood function 


Case h —t 0 

For the classic SOM, we had observed that in the limit of small a the stability does 
not depend on a i.e. was constant. To check this for the HSOM, we now have a look 
at the normalized stability limits for a < 0.5 in Fig.32. 

It can be seen, that the stability limits are indeed nearly constant. Furthermore can 
we again notice the influence of the deformations as even the stability limits for 
different sizes of the same map differ significantly. While the (3,6)-map is again less 
effected, the split-up for the other maps is more intense. This also makes a further 
comparison of the different maps futile. 
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DISK_EUCL - GAUSS (sigma small) 



FIGURE 32: Stability limits of the five Hyperbolic maps for the Gaussian neighborhood function with 
very small cr 


7.4 Analysis of configuration with hyperbolic map and 
feature space (GRiSOM) 


At last, we are going to analyze a SOM with a non-Euclidean feature space. As 
mentioned in the introduction of this chapter, the adaption steps for this configura¬ 
tion are much more time-consuming as the program has to compute rather complex 
geodesics instead of simple vector differences. Thus, we only had the time to study 
one class of maps. We chose the (7,3)-map as it has the slowest growth rate. This 
allowed us despite the lack of time to find at least the stability limits for up to 5 
layers and 4 different choices of neighborhoods. 


# layers 

GRiSON 

3 

1 case - (7, 
4 

3)-maps 

5 

# nodes 

22 

40 

70 

# meas./param. 

1500 

1500 

1500 

# adapt.steps 

1.5 Mio 

1.5 Mio 

1.5 Mio 


Table 9: Choice of the size of the sample sets for the Hyperbolic-Hyperbolic GRiSOM 

As there are still too few results to perform a meaningful quantitative analysis, we 
will just briefly focus on the most significant qualitative results, we can obtain from 
the stability limits listed in Tab.10. 
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dNN ~ 0.566 

(7,3) 

neighborhood 

3 layers 

4 layers 

5 layers 

NN 

1.10(± 0.05) 

1.15(±0.05) 

1.180(±o.o5) 

Gauss u=0.05 

0.63(±o.o5) 

0.61(± 0.02) 

0.68(±o.o4) 

Gauss cr- 0.1 

0.65(±o.o2) 

0.62(± o.o2) 

— 

Gauss a=0.5 

1.25(±o.oi) 

1.34(± o.oi) 

1,44<± o.o5) 


Table 10: Results for (7,3)-hyperbolic map in Hyperbolic Poincare disk space 


First we notice, that the stability limit is no longer influenced by the number of lay¬ 
ers as it has been for the HSOM. This is evident since the geometry of the input is 
the same as the map, but its dimension exceeds the map dimension. Thus the SOM 
face the same kind of dimensionality conflict as the classic SOM. Another similarity 
to the classic SOM is the observed height of the stability limit. If we normalize, for 
example, the stability limit for the NN neighborhood function in respect to the edge 
length of the map, we get approximately s* (normalized) = 2.0 which lies still rel¬ 
atively close to the corresponding limits of the classic case. Even more obvious are 
the similarities for the Gaussian neighborhood function. As the edge length is rather 
large in respect to a, the relative range a is very small (~ 0.1). The relative range for 
the two smaller choices of Gaussian neighborhoods is thereby lower than the limit 
of 0.35, below which we observed the constant stability limits in the classic case (cf. 
section 7.2.2). The same constancy can be found for the Hyperbolic-Hyperbolic case, 
even if the stability exceeds the found classic ones by a factor of 2. As even a = 0.5 
does not result in a large relative range, but are not able to study the other limit. 
This and as well the VQ case would be therefore an interesting subject for further 
future studies. 


7.5 Summary of numerical results 

To provide a final overview, the results, which have been obtained by evaluating 
the numerical simulations, have been listed in Tab.11. 


Classic SOM 


map 

Gauss (large a) 

Gauss (small a) 

NN 

VQ 

(3,6) 

(± 0.00) 

0.703(± 0.008) 

1.602(±o.oi) 

0.690(± 0.005) 

(4/4) 

(± 0.00) 

0.726(± o.oo6) 

1.540(±o.oi) 

0.683(± o.oo3) 

(6,3) 

(± 0.00) 

0.826(± o.oo3) 

1.508(±o.oi) 

0.803(± o.oio) 

Hyperbolic-Hyperbolic G 

RiSOM 

(7,3) 

m 1.13 — 

« 2.01 

— 


Table 11: Results of the analytic stability analysis (N ou t el —number of edge nodes) 

For the classic SOM we could therefore verify most of the analytic results as most 
of the discrepancies between the numeric and corresponding analytic results could 
be explained. Only stability behavior for the Vector Quantization still remain to be 
studied further (cf. Conclusion). 
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The main problem of the evaluation of the HSOM was the underlying effect of the 
deformations caused by the choice of the truncation method. Nevertheless we could 
show that the stability limit depend on the number of edge nodes and that all maps 
exhibit the same stability behavior for the Gaussian neighborhood function that we 
have seen for the classic SOM before. 

The evaluation of the Hyperbolic-Hyperbolic GRiSOM finally does not provide 
much informations as we only had determined the stability limits for just one map, 
but we at least could deduce that the stability limit does not depend on the size 
of the map as one would expect since the map and feature space have the same 
hyperbolic geometry. 


7.6 "travelling ruler problems" 


We will conclude this chapter about the numerical analysis by returning to the three 
"travelling ruler problems" which we have presented in the motivation. After defin¬ 
ing and implementing the GRiSOM we have finally the means to solve all of them 
using the numerical simulations. We therefore use a map space that is homeomor- 
phic to the S 1 . In this space we embed N = 50 neurons regularly. The feature space is 
chosen such that it meets the requirements of the particular problem. Furthermore, 
we use a Gaussian neighborhood function and the adaption parameters following 
parameters in all three cases 

ffl 

e = 0.1 (7 = 10 • dist.neurons • 0.02 #ada P lsle P s 

where m is the index of the current adaption step and #adapt.steps = 10000 is the 
number of adaption steps. The large cr in the beginning allows intense deformations 
of the initial path to form the rough shape of the final path and as the range of 
the Gaussian neighborhood function decreasing over time, the path becomes more 
and more detailed. The positions of the cities are thereby used as the input samples 
randomly picking at each step one of them. To give a better visualization, the results 
were finally plotted into the maps that we used to motivate these problems. 


Emperor Frederick I Barbarossa 

In this first case we work with an Euclidean feature space with no boundaries. As 
mentioned, the samples, that we use to train the SOM, are the positions of the cities 
are given by the coordinates in the map (i.e. the coordinates of the respective pixels 
in the image of the map) as shown in Tab. 12. 
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city 

pos 

X 

ition 

y 

Frankfurt 

201 

-298 

Roma 

334 

-708 

Jerusalem 

1100 

-1095 

Gallipoli 

790 

-772 

Buda 

540 

-445 


Table 12: cities and their positions in the Barbarossa TRP 



FIGURE 33: Traveling ruler problem: (left) init (right) final path 


Pres. Barack Obama 


For the case of the TRP of Pres. Obama, we now have an non-Euclidean feature 
space i.e. sphere. Again, the samples, that we use to train the SOM, are the positions 
of the cities, which are this time given by the geographic coordinates on earth as 
shown in Tab. 13. 


city 

posit 

longitude 

ion 

latitude 

Washington D.C. 

38N 

77W 

Berlin 

52N 

13E 

Jerusalem 

31N 

35E 

Moskow 

55N 

37E 

Nairobi 

IS 

36E 

Beijing 

39N 

116E 


Table 13: cities and their positions in the Obama TRP 


Using the GRiSOM with a Spherical feature space, gives us the shortest path shown 
in Fig.34. 
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Cities • 
Path nodes 

Path - 



FIGURE 34: Traveling ruler problem: (left) init (right) final path 


It can be noticed that all the neurons lie perfectly on the geodesics that connect the 
cities. The minimal flight distance needed to visit these 5 cities by following the 
linear map, is 35 232,3 km. 


President Kse'nu 

The last of the three problems was situated in the hyperbolically-curved space, 
where we have to minimize the flight path of Kse'nu pan-galactic cruiser on its tour 
to visit several galaxies located at the (fictional) Poincare disk coordinates listed in 
Tab. 14. 


galaxy 

pos 

X 

ition 

y 

Milky Way 

0 

0 

M31 

-0.4 

0.8 

NGC 6822 

0.1 

0.65 

Sombrero Galaxy 

-0.8 

-0.4 


Table 14: galaxies and their (fictional) positions in the Kse'nu TRP 

By training the GRiSOM for this data, we obtain the shortest route plotted in Fig.35. 

Thus, we have also finally achieved our primary goal to solve the three travelling 
ruler problems despite the non-Euclidean feature spaces. 
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FIGURE 35: Traveling ruler problem: (left) init (right) final path 




Part V 

Epilogue 




Chapter 8 

Conclusion and Outlook 


Motivated by the three "travelling ruler problems", we have defined in this thesis 
the General Riemannian Self-Organizing Maps as a modification of the classic SOM 
in order to be able to use more general map and input spaces. In addition, we have 
provided a concrete implementation of this GRiSOM for at least Hyperbolic and 
Spherical map and feature spaces, which enabled us to finally solve the TRPs in 
their underlying spaces. 

Our second goal was also achieved as we studied both the triangular and hexago¬ 
nal Euclidean maps and hence finally got an overall view on the stability properties 
of SOM with arbitrary regular Euclidean maps in regard to the three used classes of 
neighborhood functions. We were also able to obtain numerical results concerning 
the stability of non-Euclidean SOMs as we studied the HSOM and moreover got a 
glimpse of the stability behavior of a Hyperbolic-Hyperbolic GRiSOM configura¬ 
tion which was not possible with any other formerly defined SOM. 

All things considered, this thesis raises many new and interesting questions and 
subjects to be studied in future works. First of all, the gap between the numeri¬ 
cal and analytic stability limit for the square and hexagonal Euclidean maps when 
using very small neighborhoods has to be further analyzed to find the specific rea¬ 
son for it. Secondly, it would be very interesting to extend the stability analysis of 
the HSOM and GRiSOM by using e.g. the distant-based map truncation to get rid 
of probable deformation effects or even perform an analytic approach. One of the 
most interesting subjects can the extension of this generalization of map and input 
spaces on related techniques like the vector quantization. First tests on spherical in¬ 
put spaces have, for example, shown that in the case of only few quantization vec¬ 
tors the representation errors is often smaller if we use the inherent space instead 
of mapping the space onto an Euclidean space and using then the classic Euclidean 
VQ technique. While this can be credited to the better suited competition process, 
the following short example shows, how VQ can also benefit from the modified 
update rule using moves along geodesics. 


"N-valley problem" 

We assume that we have given a landscape with n concentric circular valleys, pair¬ 
wise separated by high mountains. The metric is then given by the cost to travel 
there. While is quite easy (i.e. cheap) to reach any point in the same valley, it is 
quite expensive to visit any other point outside. To simplify the definition of the 
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geodesics, we assume that the travelling costs inside a valley are thereby very small 
compared to the cost for travelling the mountains and that the valleys are also very 
narrow. Thus, the corresponding line element is approximated by 


ds 2 ~ dr 2 + r 2 


1-E*( r -*i)] d© + r 2 ej^5{r - Ri)d© 


where r and © are the polar coordinates with the origin in the common center of 
the circular valleys, R, is the radius of the zth valley and e the ratio between the cost 
of moving in the valley and in the mountains. The geodesics therefor exists and are 
(almost) everywhere unique 21 . They are simply combinations of segments of the 
circles forming the valleys and almost straight lines perpendicular to them. 

Our goal is now, to quantize samples that are distributed only in the valleys. The 
top left plot in Fig.36 shows the initial positions of the quantization vector which 
have been randomly drawn from the area bounded by the outermost valley. First, 
we used the classic Euclidean VQ. The result can be found in the plot in the top 
right showing the positions of the quantization vector after a training phase with 
10000 samples and the Voronoi tessellation in respect to the Euclidean metric. Al¬ 
though the result is stationary, it is not a good quantization for intrinsic structures 
of the valleys. Now, we modified the competition process to take the metric, we de¬ 
fined above into account. This leads to the result plotted in the bottom left of Fig.36 
where the Voronoi cells are now defined by the metric of the valleys. While one 
outermost vector represents nearly all the valleys, the rest of them is located near 
the origin. This state is not even stationary for finite learning rates as it happens 
that the outermost vector, by following an Euclidean line, gets closer to the origin 
than the outermost of the rest of the vectors. In this case, the roles between both are 
swapped. Finally, both the competition and adaption process have been modified 
as described in for the GRiSOM, which will finally provide the desired quantiza¬ 
tion. It is note-worthy, that, given the same number of vectors as valleys initialized 
as described above and a sufficient long training phase (or large enough learn rate), 
this quantization is always obtained. If a vector quantize more than one ring at a 
given state, it is certain due to the ergodicity of the stochastic VQ process, that the 
vector will be at one point in time further away from the innermost valley repre¬ 
sented by it than a vector lying closer to the origin. This will allow the latter one 
to take over this inner valley. Thus the vectors "trickle" towards the outer valleys 
until each of them represents one valley. 


Even if the conditions that have specified are very particular, this example serves 
well to show the possibilities that are inherent to our proposed modification and 
may motivate future work on this field of study. 


21 Only for points which are antipodal in the same valley or lie in different valleys the geodesics are 
not unique. 




115 




FIGURE 36: VQ in "N-valley problem" (top left) initial state (top right) classic VQ (bottom left) VQ 
with modified competition and classic update process (bottom right) VQ with modified 
competition and update process 
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Appendix A 

Calculations/Proofs 


This chapter contains the most important (longish) calculation that have been skipped 
or shortened in chapter 6 due to lack of space. We start with closing the gap in the 
derivation of the Fokker-Planck equation, and list then derivations and geometric 
calculations which had been skipped in chapter 6, but are not too obvious. The 
proofs of the theorems in chapter 4 are finally concluding this chapter. 


A.l Derivation of the Fokker-Planck-Equation 


By inserting in the Chapman-Kolmogorov equation the transition probability (Eq.6.2) 
we get 

S(w, f + 1 ) = f d N zv'Q(zv,zv')S(zv',t) = Y] [ d N zo f dvP(v)5(zv — T(zv' ,v,e))S(zv' ,t) 

J r J JF r (w r ) 

To simplify the integral we'll now substitute zv := T(zv 1 , v, e). Therefore we need the 
reciprocal Jacobian determinant 


dT r , 


dzv r / n 


(diagonal matrix) 


d(zo rm + eh%(v m - zv r 
dzv r i n 

det(<S,y(l — eh%) d ) 
n(l “ eh °rs) 


= 3 rr >(l - £h° rs ) 


where s is the winning neuron for sample v in the state zv'. Thus we can now easily 
perform the integration by zv and resubstitute zv. 


S(zv,t + 1) ( = I J( £ ) dz 


[Sfct.) 


E /w 

r 


‘ dvP(v)5(zv — z)S(T L (z,v,e )) 

F,(T- 1 (z,o,e) 

dv P(v)S(T^ 1 (zv / v,e)) 


22 


'F r (T 1 (zv,v,£)) 


f dvXr(T 1 (iv,v,e),v) 
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where Xri w > v) is the characteristic function of the Voronoi cell of neuron r: 


Xr{w,v) 


1, if v E F r (zv) 

0, else 


Thus we obtain 


=> zv r 
=> w' r 


[T(zv r ,v,e)] r = zv' r + eh%(v — w' r ) = zv' r (l — eh%) + eh%v 


T 1 (zv,v,e) 


zv r — ehy S v 
1 - eh° rs 


ZV r 


1 - eh% 


— £ 


h° 

ii. rs 


1 - eh% 


ZV r ~ £ll r <:V 


=:h r . 


h o p. 

where we have introduced the function h rs := 1 _2 J o which difference to // rs is only 
of an order of e. Since e is small, we can expand S(zv',t ) and likewise /(e) and ob¬ 
tain 


S(T \zv,v,£),t) 


Taylor at w 


+ 


dS 


S(zv,t) + J^{ T 1 {zv,v,e) . -zv rm )^—(zv,t) 


r,m 


hX^O T 1 ( w ' v ' £ ) ~ w rm)( T 1 (w,V, e) 


rm r'n 


ZV r 'n) 


d 2 S 




(zv,t) + 0(£ 3 


dS 


S (zv r t ) ^(0 (zOyif/ -f- chyfi (zv 

rm Vm ) -zv rm )~ - (zv,t) 


T n, Xj Xj £ h rs {w rm Vm)h r i s (zV r in 


dTUpjfi 

d 2 S 


rm r'n 


dw rm dzv r f n 


(zv,t) 


and 


/(e) = 1 + e^| £= o + 0(e 2 ) (Taylor of " at 0) 


£=o + C^(£ 2 ) 


1 | 

[n,.(i -eh°rs )] d+1 

rir'^r(l — ^r's) I _ , m l 


= 1+ed^ 


h° 

IL rS 


(i - £h%) iir'M 1 - £ ~^ S )( n f (i - ~<y 


|g=o + 0(e ) 


= 1 + £d hrs + O (e 2 ) (tran I- inv - } l + £ d £ hro +0 (e 


r 

-• ; /i 


22 Subst. T(zv',v,e) = z => dw' = dzj(e ) 
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h rs could be replaced by h r o since we sum in j\ over the whole set of neurons. Com¬ 
bining these result with the result of the integral, we get 


^(S(zv,t + 1) — S(zv,t )) ^ ^ 


/ dvx r (T 1 (zv,v,e),v)P(v)S(T 


- S(zv,t )) 

= ^ ^(1+ e/i + 0(e 2 )) J dvXr{T^ 1 {zv,v,e),v)P(v)(S(zv / t) 


dS 

+ D( w rm + eh rs (zv rm Vm ) ~ w rm)^—{w,t) 

rm oWym 

1 d 2 S 

+ 3 T\Y\z 2 Ks{Wrni - V m )h r ' s (w r ' n ~V„)- --- (zv,t) + 0(£ 3 )) - S(zV,t ) 

' OZVrmOZVr’r, 


rm r'n 


=0 


= -((D [ dv Xr(T 1 (zv,v,e),v)P(v))S(zv,t) - S(zv,t)) 

£ c J 


=1 


dS 


+ V. / dvP(v)Vh rs (zVrm-Vm k -( ZV,t)+hV dvP(v ) 

s Jf s (w) rm OZVrm s JF s (w) 


=1 


£ ( d 2 S 

+ / dvP(v)Y^Y J £ 2 h rs (zv rm -v m )h r ' s (zv r > n -v n )—— - (zv,t) 

s JF s (w) rm r i n OZV rm oZV r i n 

+ 0(e 3 ) + -J 2 §(zv,t) 


(A.2) 


If we are in or rather very close to a stationary state, the probability to leave it should 
be zero i.e. S(w, t ) is peaked around a zv which is chosen such that 

J dvP(v)T(zv,v,e) — zv = 0 

Thus, we define 

S(u, t ) := S(zv + u, t ) 

For the following it is convenient to have following quantities defined: 

:= zv rm v sm )h rs P s {v ) =$■ V r (zv ) = v s ^)h rs P s ^zv') (A.3) 


(w mi - i> sm )(av n - p s „)P s (w) + / (u m - - v sm v s „P(v)dv 

JF s (w) 


k)rmr f hrsh 

s 

D rr ,{zv) = y\h rs h r f s (zv r - v s )(zv r i - v s ) T P s (zv ) + / (uu T - v s vJ)P(v)dv 
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In the limit of small £, we may obtain then a first Fokker-Planck equation for our 
problem: 


S{u,t + 1) — S{u,t) St 3° d f S(u, t) 

(A. 2) 1 . . , , ^ , , dS(u,t) 

=> - S(u,t ) = JlS(u, t) + Y^ v rm{w + U )~A 1 - - 

£ rm ou rm 

£ ^ , _ x d 2 S(u,t) e T . 

+ 2 E D rmr , n {zv) di + 

nnr'n rrn r n 


We are now going to find more convenient notation of it by finding other expression 
for the first two terms on the RHS: 

As the first order term represents the restoring force, is has to vanish at u = 0 
which yields 


VW/ ( — i \dS(u,t) (Taylor) 

L, v ™( w + u )-£:—- 


-E^S(«,()+ E ^ps(u,tM 

TTtl 


rm dlVrm 


=0 


+ E(V„(ffi) + E (®)«r „ + 0 (x 2 )) 3S< “' ,) 

rm s ---' r 'n OW r i n 


du r 


= 0 ' 


,23 


(chainrule) 


^dV rm ^ 9 ( dV rm 

~Ljc^ s ( u ' t )+ E — ( 5 T^OK'«S(u,f) 


rm du> nn 


rnir'n ^ Urm V ^ZV r ' n 


if £ < 1 


Vr(w) ( = } Y](zv r - v s )h rs P s {w ) = Y](zv r - . 1 ) / dvP(v)v)h rs 

s s ^O) 

= Y2(P s (zv)iv r — / dvP(v)v)h rs = Y^ dvP(v)(w r — v)h r 

q JFs(zv') g J Fg^zo') 


S 

T (w, v,e) r = zv r + eh% (w r — v ) 


, , \ w r — T(zv,v,e)r £«1 1 , s X 

^ hrs{zv - V) = -- , n x ~ -O,. - T{ZV,V ,£) r ) 

e(l — e«u ) £ 


VrO) = E/ dvP(v)-(zv r — T(zv,v,e) r ) = - f dvP(v){zv r — T{zv,v,e) 
s Jf s (zv) £ £ J 


Thus we obtain: 


E™ gi =’ » L„„ f dvP(v)( 1 - gj) = 1 /<fcPM - -y) 

rm ULU rm 

=Tr(l~E) 

And by using the fact that def (1 + X)«l + XifXis small, we get: 

=> /(e) = (1 — eA) + O (e 2 ) = 1 — elf A + 0(e 2 ) 


23 Indeed, the restoring force vanishes in the equilibrium state 
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Comparison with coefficients, of Eq.A.2 yields 

7(c) = 1 + eji + d(c 2 ) 

=*■/. = -Trd = - STr(cA) = iTr(l - g) { U ] g l = \j dvP(v) eh = h 

= 1 

Thus we obtain the final form of our version of the Fokker-Planck Equation: 


d tS(u,t) = JiS(u,t) + 


dV n 


E oVrm r / ,% 


rm dlVrm 


+ Y2 ( jr JI1 (.‘ iD ) u r l nS(u,t) 

rmr'n ^ U rm \dW r ' n 

, e v-. , _ x d 2 S(u, t) 

Ty D r mr'n{w)~^ ^ 

Z .i i (J li-YfYt(j Llyl yt 


( 


= hs-hs+ £ 


1 dUr 


=0 


\ 


^Z!L(zv) u r i n S(u,t) 

OWffyj 


\ 

\ mD rmr'n 




£ sr^ t -\ d 2 S(u,t ) 

T 2 Xj ^ rmr'n fa) 2 
rmr'n 


d w 31/ jc/ ^ 


A.2 Various calculations 


Proof of Eq. 6.11. Initial cond.: S(u,0) = d N (u — Uq) 


Crmsn(0) — J dllS fa, 0) (Wj-m (0) U rm (0))fa sn (0) ll sn fa)) 

= farm, 0 n rm ,o) fasn,0 M-sn, o) = 0 

dtU s i(t) = £ ^| B slr'n H r' n = ^ > df w(f) = E ‘ ufa) 


The solution of this differential equation is given by: 
=>- wo exp B J e(r)dr^ = Y(t)uo 
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The 2nd moment can then be deducted by: 


' u s'v) — £ ( r. lrm(Urm (0 ' M s '/') + Bs'l'r'n (MsZ (0 ' Uf'n) ) T £ Dsis'l 1 

\rm r'n ) 

dtC sh , v — 3f ((zz s ; ■ m s '/') (Ms/)( m s //')) — dt(u s i ■ u s ii>) (9f(u s /))(u s ///) (M s /)(9f(u s '/')) 

= B I r. ^slrtn (Urm (t) • M s /;/) T" y~) Ps'l'r'n^Msl (0 ‘ ^r'n) J T B ^sls'l' 

\ rm r'n J 

( Jj^slrm^rm^s'!' ( £) ^_ ( Ps'l'r'n^-r'nHsi 


— ^ ( r. Bslrmi SP-mi (t) • M s /;/) (M ra; (t) • M s /;/)) + Ps'l'r'n ( (Ms/ (^) ‘ U-r'n) Mg;M r ' n ) 


T£ Dsls'l' — £ I ^slrmCrms'l' T ^^C s i r i n B r r ns rji J T £ Dsls'l' 

\rm r'n ) 


= —£ 


((BC) 

sZs'Z' + (CB T ) s ;g/;/^ + £ 2 D sZs'Z' 


3 f C(f) = —e(t)(BC + CB t ) + £ 2 (t)D() 

C(f) =: SY(t)C* (t)Y T (t) = e -Bf‘e(T)dT C *^ e -B T f‘e(T)dT (interaction picture) 


□ 


Proof of Eq. 6.12. 

W dfC(t) = -B£e~ B fo< T ) dr C*(t)e- BT 

+ e~ B fo £ ( T ) dT (d t C* (t))e~ BT So £ ^ dr + e ~ B So £ ( T ) dT C* (f) (-B T s)e~ BT So £ ( T ) dT 

A 9 f C(t) = -£(t)(B(e~ B So £ ^ dT C* {t)e~ BT So £ Wd^ 

(e~ B So /r)dr c * (fy-Bfj e(T)dr^ B T^ + £ 2 ( f ) D 

e _B Jo e (r)dr (0 fC * (f) )e _sT So £ ( T ) dT = £ 2 (t)D 
d t C*(t) = e 2 (t)e B So £ O dT De BT So £ ( T ) dT 
C*(f) = C*( 0 ) + [* dre 2 (T)e B So T£ ( f ) df De BT So T£ ( f ) df 


+ 
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Proof of Eq. 6.13. now £ constant, B, D (and therefore D, Y) commute: 

Y(f) ^ dre 2 e B df)dt De BT lo Y T (t ) 

e 2 Y(t) f dre^ B+B ^ ^ £ ^ df Y T (t)D 
Jo 


c {6 ± 2) 


e 2 Y(t) ^ B + BT ^ 1 e ( B+Br )/o < f ) df Y T (t)D = e(B + B T )~ 1 D 


=y-i(t)(r T (0)- 1 


□ 


Proof of Eq. 6.14. L.H.S. of (6.14): 

J dvP(v)[T(w,v,e] r — zv r = —w r + J dvP{v)w r + j dvP(v)eh%(v — w r ) 


£«0 


=1 

£ ^ h rs / dvP(v)(v — w r 

s Jf s (w) 


= £ r h rs / dvP(v)v — / dvP(v)w r 

< JfJw) JfJzv) 


=N~ 


= eJ^h rs N 2 (w s - w r ) = 24 0 


□ 


A.3 Geometric Aspects of regular Euclidean maps 


In this section we will briefly determine some geometric properties that we need in 
ch.6. Given a "center node" and its nearest neighbors of a triangular and hexagonal 
map 25 , we deduce the the shift of the centroids and the volume change of the corre¬ 
sponding voronoi cells i.e. feature sets for small deviations of the center node. Thus 
we obtain the matrices a and b which are are then used in section 6.3 to analytically 
pinpoint the stability limit. 


A.3.1 Calculating shift of centroids 

To determine the shift of the centroids, we move the center node in the x-,y- and 
z-direction of the embedding three-dimensional Euclidean space by the infinitesi¬ 
mal distance d. We then calculate the centroids of the voronoi cells in respect to the 
changed cell borders by integrating the slices of the cell perpendicular to the re¬ 
spective direction (with area A(h)) weighted by the coordinate h of the slice in the 
examined dimension and then normalizing the result by the volume of the cell. As 


24 follows since h rs depends only on Distance between r and s and periodic bounding condition 
25 The analogous calculations for the square map was already covered by [RMS94]. 









132 


Appendix A Calculations/Proofs 


d is very small, we restrict our calculations to the linear response i.e. we will ignore 
all the terms that contain powers of d that are larger than 1. 



FIGURE 37: center node and neighboring voronoi cells in the (left)(3,6)-map, (center)(4,4)-map and 
(right) (6,3)-map 


(3,6)-Tesselation 

As we have seen in chapter 3, each node of the (3,6)-map has six nearest neighbors. 
Fig.37 shows hereby the "center node" and its neighboring nodes. 

x-direction 

The first case that we want to consider is when the center node is moved slightly in 
the x-direction (cf.Fig.38). 



1+d 



FIGURE 38: Move of center node into x-direction 


Center in x-direction: 
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(1) M><) = j s ('-T l 


\/3 
1—d2+4d 


= ++'+T 


l-d\l-2d 


( 2 ) A W = w + 7# + 1 ^^- 2 ^-^) : vr 

= !-^+ 2rf + (1 ~ d ^~ M) - v/2^4rf 

_ 2+1—d—2d _ 2-4d j ; _ 3-3d _ 2-4d ^ 


+3 


vs 


+3 


+3 


JhA(h)dh _ [^-^T + l^+^Jp -> _ 

J A(h)dh _ d )|j + ; 7 2(! + 2 d)] 0 ^ + [(3 - 3 d)h - h 2 { 1 - 2 d)]L 


f^h 2 - §d 3 (l - 2d) 


11 


x = 


1—3d 


+ j2 (1 — 3d) (1 + 2d) + 


3—3d 2-4d 


- |(1 - 3d) + i(l - 3d)(l - 2d) + Od 2 


1-2 d , 1 


Hi - 2d)(l + 2d) + 3 - 3d - (1 - 2d) - §(1 - 2d) + 1(1- 2d)(l - 2d) + Od 2 


1 + y+OcP = 1 1 J+0J2 

2 18 


| + Od 2 



Center in y-direction: 


Ay = o 


Lower right in x-direction: 

(1) A(h) = j. + h^ + j.) = j. + 2h^) 



(2) A(h) = j- + ^ + (h- 1)( 


+3 


+3 

2hd 

+3 


1—2d _1_ \ _ 2-d+d _ 2M 

+3 VS' +3 +3 


(3) A(h) = ^ - ^(ft - ^) = 2 ~ rf t 1+rf - i/i 


= ^~7S h 


\/3 


■s/3 


+3 


X 


f hA(h)dh 
f A(h)dh 


h 2 2h 3 1-d 
. 2+3 3 +3 J 0 


J+ _ 2 7.3 d 
+3 3 n 73 


ll+5 


+ 


^Jz 2 


Jl 

+3 


+ h 21 - 


+3 J o 


+ 


2 h 
\/3 


/z 2 rf 

>/3. 


l±d 


V3- 


TT + I + I - 1 ~ § (1 + 2d) + \ (1 + 3d) 


d_ 

12 


l + l=^+d + 3-1- ^ 


l+2d 

4 



_+_/,3 

3+3 M 




11 


1 

l±d 

2 


\/3 


l+d 

2 
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Lower right in y-direction: 



(1) A(h) = 2hV3 

(2) A(h) = 1 


(3) A(h) = 1 - (h- 

^3 ft(fEi) 




2\/3 


i + l( 


2—2d 
1—2d 


)- 


y 


f hA{h)dh 


" 2/z 3 " 

2V3 , 

h 2 

,V3. 

0 

2 


3 

2\/3 

1 

2\/3 


4-rf 

4-5dh 2 h 3 (2~2d\]2V3 
1—2d 2 ^/3 V1—2d / j 3 

2^3 


j A(h)dh 


V3h 2 


i 

2\/3 

0 


+ [*] 


3 

2y/3 

1 

2\/3 


4^d 

4—5d / /3h 2 ( 2-2d \ 2\/3 
1—2d" 2 \ 1—2d / j 3 

J 2a/3 


_2 | 9_J I (16—8)(4—5d) _ (4-d) 2 (2-2d) 

24 "r" 24 24 ^ 2-12(1—2d) 24-3(l-2d) 


1 , j_ , (4—d)(4—5d) _ (16-8d) N /3(2-2d) 

4\/3 ' -/3 ' 2\/3(l-2 d) 24(l-2d) 


1 _ 83rf 

2 72 = VTv^ 


\/3 _ 13 j 
2 4 ^ 3 “ 


1 

9^3 


d 


±4^51 I 27(2—2d) 
24 1—2d “ r 24-3(1—2d) 

3 4—5d , 9\/3 2—2d 
2 v ^l-2d“ l “ 24 1—2d 



Analogously: 
Lower left: 


1 , 


Ax = —d 

A 

36 


Ay = + 


9^3 


Upper left: 



1 , 


1 


Ax = —d 
36 

A 

ON 

1 

II 

'>7 

< 


Upper right: 


1 , 


Ax = —d 

A 

36 


Ay = 


+ ^T d 

9^3 


Left: 
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x = 


f hA(h)dh 
jMhjdh 


if. + Ik 3 

2 ' 3" 


JO 




1-i 


[h + h 2 y 6 + [3 h - h 2 y; 


1 I 1 1 3(1 d\2 2 /-I d\3 3 I 1 

8~ r 12^2 > -- L 2/ 3' 1 2/ 8 Z 12 

1 + 1+3 — |rf — (1 — |) 2 — l + l 

1 h 
2~6 d 



A 


Aij = 0 


Right: 



A 


Aj/ = 0 


y-direction 

Now we want to see what we get when the center node is moved infinitesimally 
into the y-direction (cf.Fig.39). 




FIGURE 39: Move of center node into y-direction 
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Left in x-direction: 



(!) A ^ = T3 + fs 

(2) A(h) = j s - ^(h -l) = y/3-^ 

= 73- Afr- ( /j -(5 + fr)) 

8 73+lM+l-2^ _ (d+ _i_ )ft 


^ + 27 S) - 


X = 


f hA(h)dh 
f A(h)dh 

ft 2 1 2/d_ 

273 ^ 373 


73ft 2 

2 


2 ft 3 
3/3 


1- 


2+3 


873+15++1-27M 2 

12 


. Ill - (d + -J-tt 
2 ' ^ 2\/3 3 


1 + 


2+3 


1_d 
1 2+3 


Jl _i_ Li 

73 %/3 


J 0 


+ 




1 1- 

1 

2 


2\/3 


8y/3+15tl+l-2\/3ri 2 


1 1+ 


12 


fc -( rf +zJs)¥ Ja . 


2+3 

d 

2+3 


1/m + 3 d + 0(d 2 ) 
0(d 2 ) 


4/3 

6 

473 




1 

2 


3^3 


G(d 2 ) 



Lower right in x-direction: 



(1) A(h) = ^ 

( 2 ) ^ 
(3) A(h) = j- 




_i 

73+2rf 



_ _3_ 1 2(73+7) 

273 2 73 3+273tf 


X 


f hA(h)dh 
f A(h)dh 


A 1 + _L_/*3 
273 ^ 373 


1 

2 

Jo 


Li 

73 



Jl + Li 2 
_ 73 73. 0 


2ft 

73 



3ft 2 

273 


2(73+7) "I 1 
3 3+273d l + _4_ 

2 + 2+3 


JS_ 

73 


U 2 73+d 1 
3+273d J 1 


+ 


d 

2\/3 


(3 473 ^ 3/3 + 2 d) 3d 3 “ 12 + 4 3 “ 12 _ tzVS + jfd 

(| — +^)(\/3 + 2d) — \/3 — d + ^ + |d \ V^+-^d 

^ + ^ + ^ 2) 
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Lower right in y-direction: 



(1) 

A(h) 

= 2hV3 

(2) 

A(h) 

= 1 

(3) 

A(h) 

- 1 (" M >-I 77 


2d) 


(4) 

A(h) 

= 5 1 -j E d-h(V3-2d)-{h-%{2V3~2d) =4 


77 - 

- 2h(y/3 — d) 


V 


f hA(h)dh 
f A(h)dh 


1 -y/3—10d 

p 2v / 3 _|_ p 2V3(V3-2d) 

2p3 


VS 

2 

V3-Wd 

2v / l(V3-2i i) 


+ 


2 



2 


1 V3-10 d 

p 2^3 _|_ p 2v^(V3-2 d) 

2\/3 


[] y/3-KM + [] ^ 
2\/3(V3-2d) 2 


a y = 


z-direction 



FIGURE 40: Move of center node into z-direction 


Center: 
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3 w 


A(h)= 2 V3h 2 ^ + [ivh} 2 _£ = &(1 -2ds + 2dh) = & -2y/3d(s-h) 

JO 2\/3 


z = 


f hA(h)dh 
I A(h)dh 


^h 2 — \/3 dh 2 s + ^d 


l 2s 


JO 


- 2V3dhs + h 2 dV 3 


l 2s 


JO 


\/3s 2 — 4v / 3ds 3 + -%ds 3 4 

—=-—-— s ~|~ —ds 2 

y3s-4 v / 3ds 2 + 4y3ds 2 3 


4 7 

Az = -si 


the other voronoi cells yield: 

A(h) = 


Iz /z 2 ’ 

_ 1 _ 

1 

2 

'3h h 2 ' 

1 

l 

' 3h h 2 ' 

_ y/3 y/3. 

O 

4 

_V3 y/3. 

~r 

1 

2 

. V3 V3_ 


1 1 rx. & 3 

+ —= V3u-= - 


1 __1_ /- 

2v/3 ' 4\/3 v ~~ >/3 2\/3 ' 4v/3 ~~ 2^3 + 

, AT 1 2d(h-s) V3 ,1 


where 0 = 1 — ^ + 4 = 1 — d(/z — s) 


z = 


JhA(h)dh 
f A(h)dh 


w-iA- f> 


2/2 — d( ]j iy — sh) 


3s 2 — d(|s 3 - 2s 3 ) 
3s - d(2s 2 - 2s 2 ) 


= s — 



(6,3)-Tesselation 

For the (6,3)-map we restrict the calculations of the shifts to those resulting by a 
move of the central node in the z-direction (cf.Fig.41) 


z-direction 

Center: 

A(h) = w(h) ■ = 3\Z3(j- + d(h — s)) 2 = 3\/3(^ + d(h — s)) + 0(d 2 ) 

v 3 2 4 


<N IOn 
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FIGURE 41: Move of center node into z-direction 


where w(h) is the height of the triangle of the slice at height h. 


z 


JhA(h)dh 
f A{li)dh 


'h 2 | rife 3 dsh 2 1 

L 8 + 3 2 Jo 

l + ^-hds ] 25 


s 2 + 8d£_ lds 3 4 

4 —p--—— = S + ~ds 

2 + 2ds 2 — Ids 2 3 



For the centroids of the three neighboring voronoi cells, we get: 


A(h) 


f hA{h)dh 
f A(h)dh 



dh + ds ) 


where v = 1 — \ E\ = 1 — d(h — s) 


z 


fhA(h)dh 
J A(h)dh 


3h 2 dh 3 i dsh 2 

[ 8 3 + 2 Jo 

~3h-f +dsh \ 2S 


3 4 + ^S 3 

| - Ids 2 + Ids 2 




All the centroids of the other cells like the blue one in the figure are only shifted by 
an order of d 2 or higher, which can be neglected as mentioned. 
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A.3.2 Calculation the volume change 

To shorten the discussion here, we will again focus only on the case of a move of 
the center node in z-direction. 


z-direction 

By looking closely to Fig.40 and 41, we can easily conclude that for each of both 
maps, the volume of the neighboring voronoi cells only changes by a term of 0(d 2 ). 
Indeed, this can be seen if consider that the deformation is point symmetric about 
the point Therefore an upper bound for the volume growth can be given by 
sd ■ I ~ 0(d 2 ). Since this holds for all the finite neighboring voronoi cells, the vol¬ 
ume deficit of the center voronoi cell is of the same order and can be therefore 
neglected. 


A.4 Proofs 

Proof of Thm. 4.2.1. 1) For all x G 1R: 

P(F-\U ) < x) = P(inf {y G ]R|F(y) = U} < x) = P{U < F(x)) = F{x) 


P(F(X) < u) = P{X < F~\u)) = F(F^ 1 (u)) = u 


Hence this theorem is completely proven. 


□ 


Proof of Thm. 4.2.3. 1) Take a Borel set B C A and let B x be the section of B at x. 

By Tonelli's theorem follows then 



and since the area of A is c, the first part has been proven. 

2) It's sufficient to show that P(X G B) = J B f(x)dx holds: 

P(XGB) = P((X,H) G B 1 = {(x,u)\x G B,0 < u < cf{x)}) 



□ 





A.4 Proofs 


141 


Proof of Thm. 4.2.4. Let B be an arbitrary Borel set. Then we obtain what we want by 
calculation 


OO 

P(Y G B) = 

i =1 

OO -1 

= E(1 - e A n B) = 1 _ (1 _ p) P(Xi e A n B) 

P(x 1 e Af]B) 

P 


If Xiis uniformly distributed in Aq then we get 


P(Y G B) 


PjX, G Af]B) 
P 


PjXx G Af]B) 
P(X G Aj) 


L 


A 0 f]Af]B 


dx 



h 0 dx 

lA 0 f]A dx 


I a n b dx 

f A dx 


which concludes this proof. 


□ 











Appendix B 
Software Manual 


This manual describes how to install and use the software, that was implemented 
in the course of this thesis and is included as source code on the CD-ROM which is 
attached to the printouts. 


B.l System requirements 

To install and use the GRiSOM software library, the following requirements have to 
been fulfilled 

• ISO conform C++ compiler (e.g. g++ of the GNU compiler collection) 

• Boost C++ Libraries [boo] installed 

• MAPM [RinOl] (version >4.9.5) and MAPMX [Pri] Libraries installed (see be¬ 
low) 

• make utility to install it using the Makefile 

• (for visualization:) gnuplot installed 

• (for creation of API:) doxygen installed [vH08] 

B.2 Installation manual 

Before starting to install the GRiSOM software library, make sure that the system 
requirements above are met. Both the MAPM [RinOl] and the MAPMX [Pri] Li¬ 
braries are thereby included on the CD-ROM. They can be found in the subfolder 
<CD-directory>/GRiSOM/libs. Just unpack the two zipped tar files and follow the 
instructions in the README and INSTALL files. To install now the GRiSOM libraries, 
just perform the following steps 

1. Unpack the sources (GRiSOM. tar. gz) located in the subfolder <CD-directory>/GRiSOM 

2. Call make libs to compile the sources. This will create the static libraries 
libgrisom_core. a and libgrisom_aux. a as well as the corresponding shared 
libraries libgrisom_core. so. * and libgrisom_aux. so. * 
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Appendix B Software Manual 


3. make install will try to copy the static library files and headers into the direc¬ 
tories specified in the Makefile. By default, this is /usr/local/lib for libraries 
and /usr/local/include/grisomfor header files. 26 Alternatively, make shared_install 
can be used to copy and install the shared libraries instead of the static ones 
(needs root permissions!). 

Now the libraries are installed and ready to be used e.g. for compiling the simula¬ 
tion program main_test used as the simulation program in the numerical stability 
analysis as well as any testing program of the software package (cf. section 5.3). 


B.3 Programs 

The following programs are provided by the GRiSOM package. 

B.3.1 Simulation program main_test 

The simulation program main_test is a tool to study the stability limit of a SOM 
configuration. It can be compiled by simply using the command 

make simulation 

It is then executed by passing a set of 12 arguments (separated by whitespaces) to 
it to specify the configuration of the SOM and the search procedure: 

- ctype of spaces> (available: EUCL_EUCL, EUCL_DISK, DISK_DISK) 

- <first Schlaefli symbol, defining shape of cells in the map> (integer) 

- <second Schlaefli symbol, defining number of neighbors at each vertex in the 
map> (integer) 

- <type of neighborhood function> (available: GAUSS, NN, VQ) 

- <size of epsilon> (double) 

- <size of sigma> (double) [is ignored if NN,VQ is used] 

- <start value for search> (double) 

- <step size of search> (double) 

- cnumber of steps> (integer) 

- <number of measurements per step> (integer) 

- cnumber of adaption processes between two measurements> (integer) 

- ceuclidean map: number of nodes along one edge, hyperbolic: number of lay- 
ers> (integer) 



B.4 API 
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FIGURE 42: execution of main_test 


Fig.42 shows such an execution of main_test. 

According to the passend arguments, it simulates a GRiSOM with both Euclidean 
periodic map and feature space, a hexagonal map with a 24x24 grid and a Gaussian 
neighborhood function with a = 0.025. The learn rate e of the adaption is thereby 
set to 0.005. The whole search run consists of 15 steps starting at s = 0.048 with 
a step size of 0.001. In each of this steps 2500 snapshots, which lie 2000 adaption 
steps apart, are presented to the Mean_Extra_Dim_Analyzer, which then calculates 
the equilibrium state by averaging the states in the snapshots and computes then 
the error according to the deviation of the equilibrium state thus obtained and the 
reference equilibrium at s = 0. The equilibrium states for each search step and the 
results for the error are then saved in files in the subdirectory of . /data/ specified 
by the string in the line above the starting time. 


B.3.2 Testing programs [tester_*] 

The testing programs like tester_tess and tester_stochastics including their 
source code are provided here as instructive examples how to use several features 
of the software library. They can be furthermore easily adapted to test extensions 
like new distributions or spaces. Just make sure, that you added the location of 
your library extension in the INC_PATH (if necessary) and the library name in the 
EXTERN_LIBS variable in the Makefile and call make tester_<name> to compile the 
testing program. 


B.4 API 


An already generated pdf version of the API can be found in the GRiSOM/API- 
subdirectory, but if you want to get a HTML version or maybe only the API of 
the core library, just use make f ull_api and make core_api, respectively, to create 
them. 


26 


For further detail be referred to the commentaries in the Makefiles. 














Appendix C 

Simulation data/results 


neighb.fct. 

6x6 

10x10 

12x12 

16x16 

24x24 

32x32 


periodic boundary conditions 


NN 

0.412(± o.oo 2 ) 

0.242(± o.ooi) 

0.202(± o.ooi) 

0.152(± o.ooi) 

0 . 101 (± 0.001) 

0.076<± o.ooi) 

Gu=0.025 

0.178(± o.ooi) 

0.108(± o.ooi) 

0.091(± 0.002) 

0.067<± o.ooi) 

0.054(± o.ooi) 

— 

Gu=0.05 

0.175(± o.ooi) 

0.110(± 0.001) 

0.105(± o.ooi) 

0.114(± 0.002) 

0.105(± 0.005) 

0.102(± 0.005) 

Gu=0.10 

0.211(± o.ooi) 

0.225(± o.ooi) 

0.220(± o.oo5) 

0.211(± 0.002) 

0.204(± o.oo 2 ) 

0.201(± 0.002) 

Gu=0.12 

— 

0.264(± o.oo5) 

— 

— 

— 

— 

Gu=0.20 

0.443(± o.oo 2 ) 

0.418(± 0.002) 

0.412(± o.ooi) 

0.406(± 0.002) 

0.405(± 0.002) 

0.403(± 0.002) 

VQ 

0.175(± o.ooi) 

0.105(± o.ooi) 

0.087(± 0.002) 

0.065<± o.ooi) 

0.044(± o.oo 2 ) 

0.030(± 0.003) 


no boundary conditions 


Gu=0.05 

0.166<± o.ooi) 

0.104(± o.ooi) 0.098(± o.oo 2 ) 

0.100(± 0.002) 

0.100(± 0.002) 

0.098<± o.ooi) 

Table 15: Results for (3,6)-map 

neighb.fct. 

6x6 

10x10 12x12 

16x16 

24x24 

32x32 

periodic boundary conditions 

NN 

0.258(± o.ooi) 

0.154(± o.ooi) 

0.128<± o.ooi) 

0.096<± o.ooi) 

0.064(± o.ooi) 

0.048(± o.ooi) 

Gu=0.025 

0.123(± 0.002) 

0.071(± 0.002) 

0.060(± 0.002) 

— 

— 

— 

Gu=0.05 

0.120<± o.ooi) 

0.109(± 0.002) 

0.113(± 0.002) 

0.110(± 0.002) 

0.104(± 0.002) 

0.102(± 0.002) 

Gu=0.10 

0.229<± o.oo2) 

0.215(± o.ooi) 

0.212(± o.oo5) 

0.207(± o.oo3) 

0.201(± 0.002) 

0.200(± 0.002) 

Gu=0.12 

— 

0.250(± 0.005) 

— 

— 

— 

— 

Gu=0.20 

0.394(± o.oo2) 

0.392(± o.oo5) 

0.392(± o.oo2) 

0.388(± o.oo 2 ) 

0.385(± o.oo5) 

0.385(± o.oo 2 ) 

VQ 

0.115(± 0.005) 

0.068(± o.ooi) 

0.058(± o.ooi) 

0.042(± o.oo 2 ) 

0.029(± 0.003) 

0.022(± o.oo 2 ) 

no boundary conditions 

Ga = 0.05 

0.116(± 0.001) 

0.107(± o.ooi) 

0.107(± 0.002) 

0.100(± 0.002) 

0.095(± 0.002) 

0.093<± o.ooi) 

Ga = 0.2 

0.336(± o.ooi) 

0.320(± 0.002) 

0.318(± 0.002) 

0.313(± 0.002) 

0.307(± o.ooi) 

0.305(± o.ooi) 


Table 16: Results for (4,4)-map 
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Appendix C Simulation data/results 


neighb.fct. 

6x6 

10x10 

12x12 

16x16 

24x24 

32x32 

periodic boundary conditions 

NN 

0.223<± o.ooi) 

0.132<± o.ooi) 

0.110(± 0.001) 

0.083(± o.ooi) 

0.055(± o.ooi) 

0.041(± o.ooi) 

Gu=0.025 

0.121(± 0.002) 

0.071(± 0.002) 

0.061(± 0.002) 

— 

— 

— 

Gu=0.05 

0.122<± o.ooi) 

0.106(± 0.005) 

0.106(± 0.002) 

0.104(± 0.002) 

0.104(± 0.002) 

0.102(± 0.002) 

Gu=0.10 

0.214(± o.oo 2 ) 

0.214(± o.ooi) 

0.210(± o.ooi) 

0.205(± o.ooi) 

0.202(± o.oo 2 ) 

0.201(± 0.002) 

Gu=0.12 

— 

0.250(± 0.005) 

— 

— 

— 

— 

Gcr-0.20 

0.370(± o.ooi) 

0.365(± o.ooi) 

0.364(± o.ooi) 

0.362(± o.ooi) 

0.361 (io.ooi) 

0.359(± o.oo 2 ) 

VQ 

0.120(± o.ooi) 

0.068(± o.ooi) 

0.058(± 0.002) 

0.042(± o.ooi) 

0.028(± 0.002) 

0.022(± o.oo 2 ) 

no boundary conditions 

Gu=0.05 

0.121(± o.ooi) 

0.097(± 0.002) 

0.095(± 0.002) 

0.094(± o.ooi) 

0.094(± o.ooi) 

0.091 <± o.ooi) 


Table 17: Results for (6,3)-map 



(3/7) 

(3,8) 

neighbourhood 

3 layers 

4 layers 

5 layers 

3 layers 

4 layers 

NN 

0.209(± 0.002) 

0.085(± 0.002) 

0.032(± o.ooi) 

0.049<± o.ooi) 

0.010(± 0.001) 

Gauss (7=0.05 

0.081<± o.ooi) 

0.034(± 0.003) 

0.013<± o.ooi) 

0.023<± o.ooi) 

0.004(± o.ooi) 

Gauss (7=0.1 

0.085(± 0.002) 

0.034(± 0.002) 

0.013<± o.ooi) 

0.023<± o.ooi) 

0.004<± o.ooi) 

Gauss <7=0.5 

0.104(± o.ooi) 

0.040(± o.ooi) 

0.015(± o.ooi) 

0.023<± o.ooi) 

0.004<± o.ooi) 

Gauss <7=1.0 

0.235<± o.oo5) 

0.093(± o.ooi) 

0.035<± o.ooi) 

0.033<± o.ooi) 

0.007(± o.ooi) 

Gauss <7=2.0 

0.512(± 0.003) 

0.223(± o.oo 2 ) 

0.090(± 0.005) 

0.088(± 0.002) 

0.022(± o.oo3) 

Table 18: 

Results for triangular hyperbolic map in euclidean cartesian space 



(4/5) 



neighbourhood 

3 layers 

4 layers 

5 layers 



NN 

0.114(± o.ooi) 

0.044(± o.oo 2 ) 

0.0165(± o.ooi) 



Gauss <7=0.05 

0.123(± 0.002) 

0.047(± o.ooi) 

0.015<± o.ooi) 



Gauss <7=0.1 

0.120(± o.ooi) 

0.047(± o.ooi) 

0.015<± o.ooi) 



Gauss <7=0.5 

0.120(± o.ooi) 

0.042(± o.oo 2 ) 

0.015<± o.ooi) 



Gauss <7=1.0 

0.198(± 0.002) 

0.070(± 0.005) 

0.023<± o.ooi) 



Gauss <7=2.0 

0.470(± o.oio) 

0.195(± 0.005) 

— 



Table 19: Results for square hyperbolic map in euclidean cartesian space 



(7,3) 

neighbourhood 

3 layers 

4 layers 

5 layers 

6 layers 

NN 

0.342<± o.oo3) 

0.260(± 0.003) 

0.182(± 0.002) 

0.115(± 0.005) 

Gauss (7=0.05 

0.207<± o.ooi) 

0.171 (±o.ooi) 

0.122(± o.oo2) 

— 

Gauss (7=0.1 

0.208(± 0.002) 

0.177<±o.ooi) 

0.126(± 0.002) 

— 

Gauss (7=0.5 

0.375<± o.ooi) 

0.284(± o.oo 2 ) 

0.198(± 0.002) 

— 

Gauss (7=1.0 

0.684(± o.oo2) 

0.600(± 0.005) 

0.435<± o.oo5) 

— 


Table 20: Results for (7,3)-hyperbolic map in euclidean cartesian space 
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(6/4) 

neighbourhood 

3 layers 

4 layers 

5 layers 

6 layers 

NN 

0.134<± o.ooi) 

0.047(± o.ooi) 

0.018(± 0.002) 

0.007(± 0.002) 

Gauss <7=0.05 

0.148 (±o.ooi) 

0.059(± o.ooi) 

0.016(± o.ooi) 

— 

Gauss <7=0.1 

0.147(± o.ooi) 

0.061<± o.ooi) 

0.018(± 0.003) 

— 

Gauss <7=0.5 

0.131(± 0.002) 

0.053<± o.ooi) 

0.018(± o.ooi) 

— 

Gauss <7=1.0 

0.168 (±0.002) 

0.060(± 0.002) 

0.020(± 0.005) 


Gauss <7=2.0 

0.440<± o.oio) 

0.165(± o.oio) 

— 



Table 21: Results for (6,4)-hyperbolic map in euclidean cartesian space 
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Appendix C Simulation data/results 



Table 22: Search runs for (6,3)-map with N=16 










Appendix D 
UML Diagrams 



FIGURE 43: Class diagram of distributions 
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Appendix D UML Diagrams 



Figure 44: Class diagram of spaces 
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Figure 45: Class diagram of analyzers 
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Appendix D UML Diagrams 
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